




















THE JOURNAL OF 
EDUCATIONAL PSYCHOLOGY 


Volume XXIX February, 1938 Number 2 
















TEST RELIABILITY AS A FUNCTION OF METHOD OF
COMPUTATION¹


H. H. REMMERS AND LAURENCE WHISLER 
Purdue University 


I. THE PROBLEM 


It requires no extended brief to demonstrate the fundamental
importance of measurement in any science. And when measurements
are made in any science two types of questions naturally
arise: How much trust can be put in the accuracy of the measurements,
and how pertinent are the measurements to the solution of
human problems? A contemporary writer states, “the cogency of
any conclusion about the range of human capacities depends not only
upon the validity of our concepts of measurements, but also upon
the reliability of the data to which they are applied.”² A second
truism is that, owing to the variety of abilities, attitudes, and actions,
the problems of accurate and pertinent human measurement are by no
means simple.

The present paper is concerned with the current confusion of the 
concept of test reliability. It will note wherein the confusion lies, will 
provide experimental data to show that, operationally defined, the 
coefficient of “reliability” or “self-correlation” of a test yields different
results for different methods of computing “reliability”; and finally
it will attempt to clarify the entire concept.

At the outset it may be well to consider the most general purposes of 
measurement. They are (1) to discover qualities or characteristics 
and their amounts, and (2) to establish functional relationships among 
these qualities and characteristics. 


1 This study has been aided by a grant from the Josiah Macy, Jr., Foundation. 


2 Wechsler, David. The Range of Human Capacity, Williams and Wilkins
Company, 1935, p. 4.

The investigator studying the “reliabilities” of psychological
measurements is exposed to numerous pitfalls. Notable among these 
pitfalls are the various factors which may affect the coefficient of 
correlation. A few examples from the scientific literature of psychology 
may illustrate some of these pitfalls. A few years ago a professor of 
psychology published a study of the relationships among the mental 
functions measured by certain of the tests in Whipple’s Manual of 
Physical and Mental Tests. He applied these tests to a class of graduate 
students in psychology, computed the inter-correlations of the tests and 
concluded that these various mental functions had little or no functional 
dependence upon each other. His conclusions were for some of the 
tests at least almost certainly diametrically opposite of the truth of the 
matter. The explanation of the lack of correlation lay in the fact that 
tests of word building, etc., when applied to such a homogeneous group 
of students yielded “reliabilities” so low as largely to preclude the
possibility of any correlation. If the sampling had included the total
range of talent in the literate population, relatively high “reliabilities”
and inter-correlations would doubtless have been found.

Another illustration is that of a recently published group mental 
test which the authors state correlates .8 plus with the Stanford Revision
of the Binet-Simon Test when corrected for attenuation. But
since test “reliability” was computed by the split-test or odd-evens
method, it is almost certain that a higher “true reliability” would have
been obtained if the “reliability” had been computed by the equivalent
forms method. Experimental data to be presented in this paper bear
directly on this problem. 


II. THEORETICAL 


There are in general three operational definitions of self-correlation
or “reliability.” In each method two series of test scores, made by the
same persons, are correlated. In the test-retest procedure the items in
the tests are identical. There is a time interval between tests. In the
equivalent forms procedure the items in the tests are different but have
been selected to be comparable in difficulty and in content. There is a
time interval between tests. When one form has been given immediately
after the other it has generally been assumed that the time interval
is of no consequence. We shall examine this assumption. The third
procedure is the split-half or odds-evens. In this, the subtests are
random halves of a test which is given as a single unit. Since the subtests
are composed of alternate items, the time interval between subtests
is practically nil. When the self-correlations of parts of the test
have been obtained, the “reliability” of the whole is predicted by the
Spearman-Brown formula.
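
The odds-evens procedure and the Spearman-Brown step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' computation: the function names (`pearson_r`, `spearman_brown`, `odds_evens`) and the simulated right/wrong item scores are introduced here for the example.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown(r, n=2):
    """Predicted self-correlation of a test n times as long as the
    part whose self-correlation is r."""
    return n * r / (1 + (n - 1) * r)

def odds_evens(item_scores):
    """item_scores: one list of item scores per person, in test order.
    Returns (r between odd and even halves, predicted r of whole test)."""
    odd = [sum(person[0::2]) for person in item_scores]   # items 1, 3, 5, ...
    even = [sum(person[1::2]) for person in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return r_half, spearman_brown(r_half, 2)

# Hypothetical data: 200 persons, 40 right/wrong items (simulated, not from the study).
random.seed(0)
scores = []
for _ in range(200):
    ability = random.gauss(0, 1)
    scores.append([1 if ability + random.gauss(0, 1.5) > 0 else 0
                   for _ in range(40)])

r_half, r_whole = odds_evens(scores)
print(f"odds vs. evens r = {r_half:.3f}; predicted whole-test r = {r_whole:.3f}")
```

Because both halves are taken in the same sitting, any momentary state of the subject enters both half-scores alike, which is the feature of the method the discussion turns on.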

Each of the three methods is advocated by psychologists and
statisticians of standing as the method for determining test “reliability.”
Retesting with the same test “after a period of time sufficiently long to
counteract effect from practice, and before there could be any considerable
change in the subjects with regard to the trait tested” is advocated
by Paterson,¹ et al. But this proposal does not seem useful. The
necessary time interval is indeterminate. Estimation of the proper
test-retest interval would necessarily be highly subjective.

A more recent article² concludes that “the only technique whereby
one can approximate constancy of conditions in the subjects in finding
test reliability is that of correlating scores on odd and even items. The
effects of variations in the subjects during even the short period of the
test tend to be equalized by the temporal arrangement of odd and even
items. This method seems, therefore, to give most nearly the reliability
of the measuring instrument, free from extraneous changes.”

Finally Dunlap³ states, “The true score of the Spearman-Brown
formula in the odds-even technique is the ‘true’ ability at the instant
while the ‘true’ scores of the inter-correlation of two forms is the true
underlying ability or average ability of the subject.”

The thesis of this paper is that different methods of computing
“reliability” give different results, that the range of applicability of
each is limited, and that the choice of method depends on the purpose
and conditions of the investigation. A similar opinion has been stated
by Goodenough,⁴ “What we should do, I think, is to relegate the use of
the term ‘reliability’ to the limbo of outworn concepts and express our
results in terms of the actual procedure used. It is quite as easy to
speak of ‘correlation between test and retest’ after a stated interval, or
‘between the sums of alternate items,’ or ‘between equivalent forms of
a test’ as it is . . . to use . . . the less accurate expression
‘reliability.’”





1 Paterson, D. G., Elliott, R. N., Anderson, L. D., Toops, H. A., and Heidbreder,
E.: Minnesota Mechanical Ability Tests, University of Minnesota, 1930.

2 Anastasi, Anne: “The Influence of Practice upon Test Reliability.” Journal
of Educational Psychology, Vol. XXV, May, 1934, pp. 321-335.

3 Dunlap, Jack W.: “Comparable Tests and Reliability.” Journal of Educational
Psychology, Vol. XXIV, Sept. 1933, pp. 442-453.

4 Goodenough, Florence L.: “A Critical Note on the Use of the Term Reliability
in Mental Measurement.” Journal of Educational Psychology, Vol. XXVII,
March, 1936, pp. 173-178.

Before citing the evidence that “reliability” is an unsuitable
concept for scientific psychology, an inquiry into the meaning of the
word seems pertinent. A common fallacy is that all things having the
same name are functionally similar. But words have both a referential
and an emotive value. The most useful words for scientific discourse
are those with a high referential and a low emotive value, i.e., words
with a precise denotation and relatively free from a “good” or “bad”
connotation. But “reliability” is a word which is associated with
“good,” “valuable,” etc.; it is pleasant to think one's tests are “reliable.”
The two statements, that the test has a high odds-even self-correlation,
.90, and that the test has a high odds-even “reliability,” are
identical except that the second not only states a result but also carries
the implication that the test is applicable in other situations. But
unless the “results” are kept clearly separate from the “implications”
it is difficult to keep from thinking and acting as if a “reliable” test has
some intrinsic virtue which will inevitably show up in future applications
of the test.

What are the systematic differences in the results obtained by the
three methods of computing “reliability” or “self-correlation”?
Goodenough found that for a period of six weeks, the test-retest self-correlation
of the Kuhlmann-Binet applied to pre-school children was
lower than the correlation between the sums of the scores on alternate
items (i.e., odds-evens method).¹ Lanier,² who gave intelligence,
mechanical, and sensory tests, concluded from his results, “In general
when the predicted and actual reliabilities were secured by correlating
the subdivisions of a single test, the agreement between them was
much closer than when correlations were of parts of two separate
applications of the test. This indicates that the attitude of the subject,
which may be assumed to be more or less constant throughout one
sitting, is an important factor in determining a test’s reliability.”

There is one factor in the test-retest situation which should tend to 
make for a high correlation, that is the presence of the response error, 
the likelihood of the subject repeating previous responses, however 
“wrong” they may be. But there are other factors which tend to 
make for a low correlation. These exist noticeably not only in the 
test-retest procedure but also in the equivalent forms procedure. 
When the two parts which are correlated differ in time, such factors 
as differences in interest, in fatigue, in distractions, and in range of 
relevant associations, may and usually do exist. A test should not be 





1 Goodenough, Florence L.: Op. cit.
2 Lanier, L. H.: “Prediction of Reliabilities of Mental Tests and Tests of Special
Abilities.” Journal of Experimental Psychology, Vol. X, 1937, pp. 69-113.


thought of as a fixed thing, but as a test in relation to a situation.
Capacities are not definite quantities, but are always capacities in a 
particular momentary situation. Similarly, but more obviously with 
attitudes, a measurable attitude is always an attitude in relation to a 
given momentary situation. These changes in measurable characteristics
with the passage of time give point to the caution made by
Thouless¹ that one should avoid confusing “test unreliability” with the
“effects of function fluctuation.”

For any self-correlation, the correlated parts should be comparable 
in content and in difficulty. This relation is obtained by arrangement 
in the construction of equivalent forms, and by random selection in 
constructing odds-evens tests. It may be assumed there is no constant 
difference in comparability of content and difficulty between odds- 
evens and equivalent form tests. But there is a time difference. And
there is a consistent tendency for equivalent form self-correlations to
be lower than odds-evens. In a recent study, Jordan² gave Forms A
and B of a scholastic aptitude test to two hundred ten high-school
pupils and to two hundred ten college students. Comparing odds-evens
and equivalent forms self-correlations, equated for length by the
Spearman-Brown formula, he found that in the fourteen obtained
comparisons of equivalent form with odds-evens coefficients, the latter
was higher in every case, and significantly so (difference divided by the
standard error of the difference equal to three or more) in eight of the
fourteen cases, while in none of the comparisons of equivalent forms with
equivalent forms and of odds-evens with odds-evens did significant
differences occur.


III. EXPERIMENTAL


At this point we may consider the experimental data bearing on our 
present problem. In a previously published paper³ it was pointed out
that the calculation of the true correlation is subject to systematic error
in that different methods of determining reliability will yield different
values of the reliability coefficient. This is readily apparent from the
formula for correction for attenuation.⁴





1 Thouless, R. H.: “Test Unreliability and Function Fluctuation.” Journal
of Psychology, Vol. XXVI, 1936, pp. 323-343.

2 Jordan, R. C.: “An Empirical Study of the Reliability Coefficient.” Journal
of Educational Psychology, Vol. XXVI, September, 1935, pp. 416-427.

3 Remmers, H. H.: “A Possible Experimental Error in Determining the
Overlap of Two Correlated Variables.” Journal of General Psychology, Vol. IX,
1933, pp. 459-461.

4 Kelley, T. L.: Statistical Method, Macmillan, 1923, pp. 204-205.

Obviously the true r will vary as a function of
the reliability values in the denominator of the fraction.

To obtain some notion of the magnitude of this systematic error in a 
number of defined situations the experiment for which data are shown 
in Tables I and II and Fig. 1 was set up. 
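
A rough sense of the size of this systematic error can be had from the usual form of the correction for attenuation, in which the observed correlation is divided by the square root of the product of the two self-correlations. The sketch below is illustrative only: the observed correlation of .50 is hypothetical, both tests are assumed to share the same self-correlation, and the two self-correlation values are the average total-test figures reported in Table II for the equivalent forms and the odds-evens (Form 1 plus 2) procedures.

```python
from math import sqrt

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated 'true' correlation: the observed r divided by the
    geometric mean of the two self-correlations."""
    return r_xy / sqrt(r_xx * r_yy)

r_observed = 0.50            # hypothetical observed correlation between two tests
r_equivalent_forms = 0.737   # average total-test value, Table II
r_odds_evens = 0.824         # average total-test value, Table II

# Assumption for the example: both tests have the same self-correlation.
true_ef = correct_for_attenuation(r_observed, r_equivalent_forms, r_equivalent_forms)
true_oe = correct_for_attenuation(r_observed, r_odds_evens, r_odds_evens)

print(round(true_ef, 3), round(true_oe, 3))  # 0.678 0.607
```

The same observed correlation thus yields an estimated "true" value about .07 higher when equivalent forms coefficients are placed in the denominator than when odds-evens coefficients are used.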

In four different instructional departments of Purdue University, 
i.e., Education, Applied Mechanics, Mathematics, and Physics, objectively
scored examinations were prepared so as to yield two forms as
nearly similar in content and difficulty as the judgment of the staff
members of these departments could achieve.¹ Form 1 was administered
first to about half the subjects, Form 2 to the others. The
examinations were analyzed for reliability according to the schema 
shown in the second column of Table I. The resulting coefficients of 
correlation are shown in the third column of this table. 

The odd-even coefficients for Form 1 and for Form 2 are the “reliabilities”
of one-fourth of the total examination. The odd-even coefficient
for Form 1 plus 2 and the coefficient for the equivalent forms are
the “reliabilities” of one-half of the total examination. The “reliabilities”
for the total examinations, as calculated by the Spearman-Brown
formula, are shown in the last column of Table I.
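
As a check on the arithmetic, the step from a part-test coefficient to the total examination can be reproduced with the general Spearman-Brown formula, using n = 2 for coefficients based on halves and n = 4 for coefficients based on fourths. The sketch below is illustrative; the function name `spearman_brown` is introduced here, and the input values are the four coefficients reported for Psychology, test a, in Table I.

```python
def spearman_brown(r, n):
    """Predicted self-correlation of a test n times as long as the
    part whose self-correlation is r."""
    return n * r / (1 + (n - 1) * r)

# (method, r of the part, n = length of whole / length of part)
# Values for Psychology, test a, from Table I.
rows = [
    ("Form 1 vs. 2",              0.595, 2),
    ("Odds-evens, Form 1 plus 2", 0.677, 2),
    ("Odds-evens, Form 1",        0.603, 4),
    ("Odds-evens, Form 2",        0.494, 4),
]

predicted = {label: round(spearman_brown(r, n), 3) for label, r, n in rows}
for label, value in predicted.items():
    print(f"{label:<28} {value:.3f}")
# Form 1 vs. 2                 0.746
# Odds-evens, Form 1 plus 2    0.807
# Odds-evens, Form 1           0.859
# Odds-evens, Form 2           0.796
```

These agree with the last column of Table I for that test.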

Table II shows how method of analysis affects “reliability.” Two
of the values, the “reliabilities” of half the test for Form 1 vs. Form 2,
and Odds vs. Evens, Form 1 plus 2, are averages from the third column
of Table I. The other coefficients were obtained by applying the
Spearman-Brown formula to the averages obtained from the third
column of Table I. Figure 1 presents the data in Table II graphically.
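
The construction of Table II can be retraced from the third column of Table I. The sketch below follows our reading of the procedure (average the seven coefficients first, then apply the Spearman-Brown formula to the average); the variable and function names are introduced here for illustration. The results agree with the published Table II to within about .001, the small differences presumably arising from rounding at intermediate steps.

```python
def spearman_brown(r, n):
    """Predicted self-correlation of a test n times as long as the part."""
    return n * r / (1 + (n - 1) * r)

def mean(values):
    return sum(values) / len(values)

# Third column of Table I, seven tests in table order.
form1_vs_form2   = [.595, .519, .437, .361, .742, .752, .680]  # halves
odds_evens_both  = [.677, .713, .639, .559, .765, .784, .761]  # halves
odds_evens_form1 = [.603, .464, .661, .377, .620, .702, .624]  # fourths
odds_evens_form2 = [.494, .640, .411, .467, .600, .652, .665]  # fourths

# Each entry: (reliability of half the examination, reliability of total).
table2 = {
    "Form 1 vs. Form 2": (
        mean(form1_vs_form2),
        spearman_brown(mean(form1_vs_form2), 2)),
    "Odds vs. evens, Form 1 plus Form 2": (
        mean(odds_evens_both),
        spearman_brown(mean(odds_evens_both), 2)),
    "Odds vs. evens, Form 1": (
        spearman_brown(mean(odds_evens_form1), 2),
        spearman_brown(mean(odds_evens_form1), 4)),
    "Odds vs. evens, Form 2": (
        spearman_brown(mean(odds_evens_form2), 2),
        spearman_brown(mean(odds_evens_form2), 4)),
}

for method, (half, total) in table2.items():
    print(f"{method:<36} half = {half:.3f}   total = {total:.3f}")
```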

Table III also shows how method of analysis affects reliabilities. 
The calculations are based on the predicted “reliabilities” of the whole
test given in the last column of Table I. For each test, the coefficient 
predicted from odds-evens, Form 1 plus 2, is taken as the standard. 





1 I acknowledge with pleasure the cordial coöperation of my colleagues in the
departments named above. To credit them all by name would require too much 
space, but I am especially indebted to Professor William Marshall, Head of the 
Mathematics Department and Professor Laurence Hadley also of this Department; 
to Professor R. B. Abbott in charge of the Engineering Physics instruction; and 
to Professor A. P. Poorman of the Department of Applied Mechanics and to Dean 
R. G. Dukes, Dean of the Graduate School and Head of the Department of 
Applied Mechanics. H.H.R.

TABLE I.—COMPARISON OF FOUR DIFFERENT METHODS OF DETERMINING RELIABILITY
FOR THE SAME TEST IN DIFFERENT SUBJECTS

Subject                      Method                       r             r of total test
                                                                        (Spearman-Brown)

Psychology, test a           Form 1 vs. 2                 .595 ± .044   .746 ± .020¹
  N = 100                    Odds-evens, Form 1 plus 2    .677 ± .037   .807 ± .014
  Test time = 50 minutes     Odds-evens, Form 1           .603 ± .043   .859 ± .006
  Test items = 120           Odds-evens, Form 2           .494 ± .051   .796 ± .009

Psychology, test b           Form 1 vs. 2                 .519 ± .050   .684 ± .025
  N = 100                    Odds-evens, Form 1 plus 2    .713 ± .033   .831 ± .012
  Test time = 50 minutes     Odds-evens, Form 1           .464 ± .053   .776 ± .010
  Test items = 108           Odds-evens, Form 2           .640 ± .040   .877 ± .005

Psychology, test c           Form 1 vs. 2                 .437 ± .055   .608 ± .033
  N = 100                    Odds-evens, Form 1 plus 2    .639 ± .040   .780 ± .017
  Test time = 50 minutes     Odds-evens, Form 1           .661 ± .038   .886 ± .004
  Test items = 106           Odds-evens, Form 2           .411 ± .056   .736 ± .012

Applied mechanics, test a    Form 1 vs. 2                 .361 ± .038   .530 ± .027
  N = 233                    Odds-evens, Form 1 plus 2    .559 ± .030   .717 ± .015
  Test time = 50 minutes     Odds-evens, Form 1           .377 ± .038   .708 ± .009
  Test items = 110           Odds-evens, Form 2           .467 ± .035   .778 ± .006

Physics, test a              Form 1 vs. 2                 .742 ± .017   .852 ± .006
  N = 300                    Odds-evens, Form 1 plus 2    .765 ± .016   .867 ± .006
  Test time = 110 minutes    Odds-evens, Form 1           .620 ± .024   .867 ± .003
  Test items = 146           Odds-evens, Form 2           .600 ± .025   .857 ± .003

Physics, test b              Form 1 vs. 2                 .752 ± .017   .858 ± .008
  N = 300                    Odds-evens, Form 1 plus 2    .784 ± .015   .879 ± .005
  Test time = 110 minutes    Odds-evens, Form 1           .702 ± .020   .904 ± .002
  Test items = 134           Odds-evens, Form 2           .652 ± .022   .882 ± .003

Analytical geometry          Form 1 vs. 2                 .680 ± .019   .810 ± .007
  N = 359                    Odds-evens, Form 1 plus 2    .761 ± .015   .864 ± .005
  Test time = 50 minutes     Odds-evens, Form 1           .624 ± .022   .869 ± .003
  Test items = 80            Odds-evens, Form 2           .665 ± .020   .888 ± .002


1 These probable errors were calculated by Shen’s formula,

$$\mathrm{PE} = \frac{.6745\,a\,(1 - r^2)}{\sqrt{N}\,[1 + (a - 1)r]^2}$$

See his article, “A Note on the Standard Error of the Spearman-Brown Formula,”
Journal of Educational Psychology, Vol. XVII, February, 1926, pp. 93-94.

If this is larger than the coefficient with which it is compared the difference
is marked plus (+).

Tables II and III show clearly that the “reliability” is directly
related to the method of determination. The odd-even procedure gives
values which are consistently and significantly higher than those
obtained by the equivalent forms procedure. Table II shows that the










[Figure 1. Increase in Reliability Values as Related to Method of
Calculating Reliability. (Data from Table II.) Curves: Reliability of Total
Examination; Reliability of Half the Examination.]


average “reliability” of the total examination is approximately the
same when computed by the equivalent forms procedure as the average
“reliability” of half the examination when computed by the odds-evens
technique based on one form of the examination. Such a result is nothing 
less than amazing, and points to the seriousness of the problem in 
unmistakable terms. 



Tables II and III show further that “reliability” is affected by the
length of the test. Comparing “reliabilities” for whole tests predicted
from fractions one-fourth and one-half the length of the whole, it is
evident that the shorter the “form,” the higher the “reliability” of the
total examination when estimated by the Spearman-Brown formula for
n forms. The difference between forms of different length is considerably
less than the difference between the equivalent forms and the odds-evens
procedure.


TABLE II.—AVERAGE RELIABILITY OF SEVEN TESTS AS RELATED TO THE METHOD
OF CALCULATING RELIABILITY

                                       Form 1 vs.   Odds vs. evens,      Odds vs. evens,   Odds vs. evens,
                                       Form 2       Form 1 plus Form 2   Form 1            Form 2

Reliability of half the examination    .584         .700                 .733              .718
Reliability of total examination       .737         .824                 .846              .836

















TABLE III.—DIFFERENCE BETWEEN r FOR THE WHOLE TEST PREDICTED FROM
ODDS-EVENS FORM 1 PLUS 2, AND THE r FOR THE WHOLE TEST PREDICTED FROM

Test                     Form 1 vs.      Odds vs. evens,   Odds vs. evens,
                         Form 2          Form 1            Form 2

Psychology, test a       +.061           −.052             +.011
Psychology, test b       +.147           +.055             −.046
Psychology, test c       +.172           −.106             +.044
Applied mechanics        +.187           +.009             −.061
Physics, test a          +.015            .000             +.010
Physics, test b          +.021           −.025             −.005
Analytical geometry      +.054           −.005             −.022
Average                  +.093 ± .017    −.018 ± .012      −.010 ± .008





IV. DISCUSSION 


Attenuation coefficients, containing in their denominators coefficients
of self-correlation, are obtained for two principal purposes: first,
to find the “true” stability of a function or trait, or, stated in reverse,
the amount of fluctuation or change in a function or trait, and, secondly,
to find the “true” relation between qualities, traits, or functions.
What procedure is most likely to obtain these “true” relations?

The statistical procedure applicable to the problem of change
depends on the function measured. In dealing with day-to-day change
of reaction times, Woodrow¹ computed the ratios of “standard deviation
of the daily averages (obtained standard deviation of the averages)”
to “the average daily standard deviation divided by
√n (Av. σ/√n).” He refers to day-to-day variability as “quotidian
variability.” However, it seems that if one desires to subsume all
variations which occur in course of time, without considering how the
changes were brought about, the general term “temporal variability”
would be more descriptive.

Thouless treats the changes which occur over a period of time as
“fluctuation of function.”² Obviously, an accurate measure of
amount of fluctuation is a prerequisite for determining the causes of
change of function. Thouless proposes several tests of function fluctuation:
the comparison of the r of equivalent tests with the test-retest r,
the intra-class r,³ the regression of the scores of the first test on the
scores of the second test, and a double test-retest index of function
fluctuation. Where $A_1$ and $B_1$ are items on the first test, and $A_2$ and
$B_2$ identical items on the retest, the index is the correlation of
$(A_1 - A_2)$ with $(B_1 - B_2)$ divided by the average of $r_{A_1B_1}$ and $r_{A_2B_2}$.
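
The double test-retest index may be easier to follow in computational form. The sketch below implements the index as we read the description here: the correlation between the two sets of test-minus-retest differences, divided by the average of the two same-occasion correlations. The function names, the simulation, and all numerical settings are assumptions made for illustration and are not taken from Thouless.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def fluctuation_index(a1, b1, a2, b2):
    """a1, b1: scores on item sets A and B at the first testing;
    a2, b2: scores on the identical item sets at the retest."""
    diff_a = [p - q for p, q in zip(a1, a2)]
    diff_b = [p - q for p, q in zip(b1, b2)]
    same_occasion = (pearson_r(a1, b1) + pearson_r(a2, b2)) / 2
    return pearson_r(diff_a, diff_b) / same_occasion

def simulate(change_sd, n=500, seed=1):
    """Hypothetical subjects whose level of the function shifts between
    test and retest by a random amount with standard deviation change_sd."""
    rng = random.Random(seed)
    a1, b1, a2, b2 = [], [], [], []
    for _ in range(n):
        level1 = rng.gauss(0, 1)
        level2 = level1 + rng.gauss(0, change_sd)
        a1.append(level1 + rng.gauss(0, 0.5))  # 0.5 = error of measurement
        b1.append(level1 + rng.gauss(0, 0.5))
        a2.append(level2 + rng.gauss(0, 0.5))
        b2.append(level2 + rng.gauss(0, 0.5))
    return fluctuation_index(a1, b1, a2, b2)

stable = simulate(change_sd=0.0)        # function does not change
fluctuating = simulate(change_sd=1.0)   # function changes between occasions
print(f"stable: {stable:.2f}   fluctuating: {fluctuating:.2f}")
```

When the function is stable the differences contain only errors of measurement, so the index stays near zero; when the function itself shifts between occasions, both difference scores share that shift and the index rises.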

The various measures of fluctuation of function based on tests and 
retests indicate the stability or instability of what is measured. If the 
differences in the test and retest are attributable solely to errors of 
measurements it should be possible to apply the attenuation correction 
and obtain an estimated “true” measure of fluctuation. But the
response error is always present in test-retest situations.

Theoretically, fluctuation of function should be measured by comparing
“the ‘true’ instantaneous correlation of two measures of the
same function,” with “the ‘true’ correlation when the measures are
temporally separated.”

1 Woodrow, H.: “Quotidian Variability.” Psychological Review, Vol. XXXIX,
1932, pp. 245-256.

2 Thouless, Op. cit.

3 Fisher, R. A.: Statistical Methods for Research Workers, Third edition, 1930,
Oliver and Boyd, Edinburgh, Chapter VII, pp. 178-211.

Where $u$ and $v$ are two measures of the same function, both $u$ and $v$
being further divided into $u_1$ and $u_2$, and $v_1$ and $v_2$, and the items
arranged in some manner, e.g. $u_1 v_1 u_2 v_2 u_1 \ldots$ or $u_1 u_2 v_1 v_2 u_1 \ldots$,
such that there is no constant time difference between the items of the
different series, the “true” relation between $u$ and $v$ is given by the
correction for attenuation.¹ Let the “true” instantaneous relation be
denoted as





(ae PEE o 


Now if the $u_1$ and $u_2$ items are arranged in one test and the $v_1$ and $v_2$
items are arranged in another test, and tests U and V are separated by
an interval of time, $r_{UV}$ corrected for attenuation will generally be less
than r’v.v,. Let the “true” relation between u and v when they are 
temporally separated be denoted as r’v.y,. Then it would seem that 
fluctuation, F, could be expressed by the equation 


F, = r'u.Ve = 0 V6 (2) 
or by 


Fy, =r yy, — 1 vev. (3) 


When any coefficient of fluctuation, F, is given, the time interval should 
be stated. 

One desired “true” measure is of fluctuation of function, the other
desired measure is the “true” relation between functions. In general,
it is the “instantaneous true” relation of functions that is sought. If
it is possible to arrange items measuring the two functions in such a
manner that there is no constant time difference between them,
equation (1) would apply. However, it is not always possible to
measure two functions simultaneously. One functional activity may
interfere with the other. Various physical conditions may make
simultaneous administration impractical. When two tests cannot be
compounded into one unit, a time separation occurs between the two
measurements. What is the appropriate method for approximating
the “true” relation in such a case? Let $u_1$ and $u_2$ be measures of one
function, U, and $v_1$ and $v_2$ measures of a second. Using the short formula
for the correlation corrected for attenuation,

$$r'_{uv} = \frac{r_{uv}}{(r_{u_1u_2} \cdot r_{v_1v_2})^{1/2}}$$


It follows from the demonstrated difference in the self-correlation
coefficients obtained by the odds-evens and the equivalent forms
methods, that a lower value will be obtained if the former is used in the
correction for attenuation than if the latter is used. The purpose of
the attenuation correction is to compensate for the effect of random factors.
In attempting to get at the “true” instantaneous relation between two
functions which are not measured simultaneously, it is possible to
consider that for one of the functions (a) the difference between the
obtained and the “true” measurement is attributable solely to imperfections
in the test, but that for the other, (b), the differences are
attributable to imperfections in the test, to changes in the individual,
and to changes in his relation to the response situation. Furthermore,
it is impossible to consider both of the functions under category (a)
and it is unnecessary to consider both under (b). Because of this, it
seems reasonable that the two terms in the denominator should be
arrived at by different procedures. One term should be an odds-evens
coefficient. The other should be an equivalent forms coefficient, the
interval between the measurement of one function and of the other
being equal to the original interval between the administrations of the
two forms. If a closer approximation is desired, a second equation
may be set up in which the statistical treatment of the two functions is
reversed and the obtained coefficient averaged with the first.

1 Spearman, Carl: Abilities of Man, Macmillan, London, 1927, Appendix, p. i.
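
The proposal of a mixed denominator, with the optional second equation in which the roles of the two functions are reversed, can be sketched as follows. All numerical values are hypothetical and the names are introduced here for illustration; the sketch shows only the arithmetic of the suggestion and makes no claim about typical magnitudes.

```python
from math import sqrt

def corrected_r(r_uv, self_r_u, self_r_v):
    """Short formula: observed r divided by the square root of the
    product of the two self-correlations."""
    return r_uv / sqrt(self_r_u * self_r_v)

# Hypothetical coefficients for two functions, U and V.
r_uv = 0.45
u = {"odds_evens": 0.82, "equivalent_forms": 0.74}
v = {"odds_evens": 0.86, "equivalent_forms": 0.70}

# First equation: U treated under (a), V under (b).
first = corrected_r(r_uv, u["odds_evens"], v["equivalent_forms"])
# Second equation: statistical treatment of the two functions reversed.
second = corrected_r(r_uv, u["equivalent_forms"], v["odds_evens"])
averaged = (first + second) / 2

# For comparison: one method used for both terms of the denominator.
all_odds_evens = corrected_r(r_uv, u["odds_evens"], v["odds_evens"])
all_equivalent = corrected_r(r_uv, u["equivalent_forms"], v["equivalent_forms"])

print(f"odds-evens only       {all_odds_evens:.3f}")
print(f"mixed, averaged       {averaged:.3f}")
print(f"equivalent forms only {all_equivalent:.3f}")
```

So long as each equivalent forms coefficient is lower than the corresponding odds-evens coefficient, the averaged value necessarily falls between the two single-method corrections.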


V. SUMMARY AND CONCLUSIONS 


It has become evident that diverse things are meant by the term 
“reliability.” Different methods of computation have been shown to 
yield distinctly different results. It has been suggested that the term 
be abandoned. Whatever is the ultimate fate of the term, two
“reliability” problems will continue to be important to psychology:
the problem of the temporal variability of function and the problem
of the relations between functions. Formulae are suggested for
measuring the “true” fluctuation of function and the “true instantaneous”
relation between functions.

The experimental data were obtained by administering seven 
objective examinations, prepared in two equivalent forms, to student 
populations ranging from one hundred to three hundred fifty-nine and 
averaging two hundred. The “reliability” of each of the examinations
was determined by the “odds-evens” and by the “equivalent forms”
procedure. The data support the following conclusions.

(1) Tests yield self-correlation values which are in general functions
of the method of calculating the values.

(2) The “odds-evens” technique will in general yield higher self-correlations
than will the “equivalent forms” technique.

(3) Formulae which involve a measure of reliability, for example, 
the correlation coefficient corrected for attenuation, are subject to very 
serious misinterpretations if the systematic differences, dependent on 
the method of calculating the self-correlation, are not recognized.


FOUR RETESTS OF A PERSONALITY INVENTORY


RUDOLF PINTNER AND GEORGE FORLANO 


Teachers College, Columbia University 


If a personality inventory is given several times to the same chil- 
dren, do they make the same score each time? If the scores fluctuate 
from test to test very greatly, is this due to the unreliability of the test 
or to fluctuations in the attitudes of the children from week to week? 
If the scores do not fluctuate greatly, is the stability so obtained due 
to the fact that the children mark the same items in the same way each 
time, or is this stability obtained in spite of the fact that many children 
show changes in the marking of individual items? 


THE TEST 


The Aspects of Personality Inventory¹ is a questionnaire consisting
of one hundred fourteen items. It requires the child to read a
statement and then to mark himself as “same” or “different.” It is
divided into three sections as follows:

Section I contains thirty-five items which attempt to measure 
ascendant-submissive behavior. This is the A-S test. 

Section II contains thirty-five items which attempt to measure 
extrovert-introvert behavior. This is the E-I test. 

Section III contains forty-four items. Nine of these are non-significant
items which are not scored. The other thirty-five items
attempt to measure emotional stability. This is the E test.

Sample items from the three parts of the test are as follows: 


I don’t mind when other children get ahead of me in line. 
I keep quiet when I am with other people. 

I would sooner say than write what I think. 

I often feel sick when I have to go to school. 

I think my parents pick on me too much. 


Each item of the test has been validated as to its internal consistency.
Each item had to correlate highly with the total score of the
section to which it belonged. In addition it had to show only a low
correlation with the total scores of the other two sections of the inventory, so
that the diagnostic significance of each section of the inventory would
be as clear-cut as possible. The three sections of the inventory have
been correlated with CA and MA. Separate correlations have been
calculated for four hundred nine boys and four hundred girls. The
correlations with CA fluctuate from −.20 to +.05. The correlations
with MA fluctuate from +.05 to +.30. Evidently neither chronological
nor mental age plays any important part in influencing the
scores of the inventory.

1 Pintner, R., Loftus, J. J., Forlano, G., and Alster, B.: Aspects of Personality
Inventory. Test and Manual. World Book Co., Yonkers, N. Y., 1937.


THE POPULATION AND PROCEDURE 


The population consisted of fifty-eight boys and forty-two girls, 
i.e., a total of one hundred cases, in the fifth grade of an elementary 
public school. The test was administered four times. An interval 
of two weeks separated each of the four trials. 


RELIABILITY OF THE TEST 


The reliability of the test (odd-even scores) for this population for 
each of the four trials has been calculated and corrected by the Spear- 
man-Brown formula. These correlations are shown in Table I. 


TABLE I.—ODD-EVEN RELIABILITY COEFFICIENTS CORRECTED BY SPEARMAN-BROWN
PROPHECY FORMULA

Trial     A-S Test   E-I Test   E Test
1st         .61        .54       .84
2nd         .69        .63       .88
3rd         .71        .67       .87
4th         .75        .57       .93

There seems to be a slight tendency for the reliability to increase with 
repeated practice. 
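
As an illustration of the computation behind such coefficients, the following is a minimal sketch, not the authors' own procedure in detail. It assumes each child's responses are available as a list of 0/1 item scores (a hypothetical data layout), correlates the odd-item and even-item half scores, and steps the result up with the Spearman-Brown formula r_full = 2r / (1 + r). The function names are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(item_scores):
    """item_scores: one list of item scores per child (hypothetical layout)."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))


# Tiny made-up example: four children, four items each.
rows = [[1, 1, 1, 1], [1, 0, 1, 0], [0, 0, 0, 0], [1, 1, 0, 0]]
reliability = odd_even_reliability(rows)
```

With real data each row would hold the thirty-five scored items of one subtest; the made-up rows above only exercise the arithmetic.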

The reliabilities of this sample of one hundred cases are similar to
reliabilities calculated for other samples on this same test. Table II
gives odd-even reliability coefficients for other samples homogeneous
as to sex and age. In all of the groups tested it is noticeable that the
E Test shows consistently higher correlations than the other two
tests.


TABLE II.—ODD-EVEN RELIABILITY COEFFICIENTS (CORRECTED BY THE
SPEARMAN-BROWN FORMULA) FOR AGE AND SEX GROUPS

Age group     n     A-S test   E-I test   E test
Boys
   10        100      .69        .76       .91
   11        100      .76        .62       .87
   12        100      .74        .53       .81
Girls
   10        100      .66        .60       .80
   11        100      .77        .58       .82
   12        100      .75        .59       .92


INTER-CORRELATIONS BETWEEN SUBTESTS


The total test was originally constructed so as to measure three
separate traits and the items of each subtest were chosen because of
their lack of correlation with the other subtests. In the original
standardization of the test the intercorrelations based upon one
hundred fifty pupils in Grades V and VI were as follows:








Test      A-S      E-I
A-S
E-I       .26
E        −.22      .29











We may now compare these intercorrelations with those obtained
in this study. We have used the first two trials of each subtest. The
correlations are as follows:

A-S Trial 1 and E-I Trial 1 =  .15
A-S Trial 1 and E-I Trial 2 =  .25
A-S Trial 2 and E-I Trial 1 =  .17
A-S Trial 2 and E-I Trial 2 =  .39
A-S Trial 1 and E Trial 1   =  .09
A-S Trial 1 and E Trial 2   = −.25
A-S Trial 2 and E Trial 1   = −.21
A-S Trial 2 and E Trial 2   = −.15
E-I Trial 1 and E Trial 1   =  .09
E-I Trial 1 and E Trial 2   =  .16
E-I Trial 2 and E Trial 1   =  .09
E-I Trial 2 and E Trial 2   =  .10


These intercorrelations are very similar in the two sets of data.
They are all low. There seems to be a slight tendency for scores on the
A-S subtest to correlate negatively with scores on the E test.
Emotional stability is not marked by ascendant scores. It would seem
rather to be slightly related to submissive scores.


EFFECT OF PRACTICE ON SCORES


The means and sigmas for the four trials are shown in Table III.


TABLE III.—MEANS AND SIGMAS FOR EACH OF THE FOUR RE-TESTS

              A-S              E-I               E
Trial     Mean   Sigma     Mean   Sigma     Mean   Sigma
  1      16.28   3.95     20.81   3.82     25.85   5.63
  2      16.60   4.42     21.03   4.01     27.08   5.93
  3      17.17   4.84     21.27   4.22     27.06   6.97
  4      17.23   5.35     21.16   4.09     27.15   6.81























There seems to be a tendency for both the means and the sigmas to
increase with repeated trials. These increases are all in the direction
of a more favorable score. We have translated the largest mean gains
for each test into percentile gains based upon such norms as exist at
present for these tests. These largest mean gains are equivalent to
changes of four to nine percentile points. We have also computed the
significance of the largest gain in the mean of each subtest and of the
largest gain in the sigma of each subtest shown in Table III.
The ratios of the difference to the sigma of the difference for each
subtest are as follows:





Subtest     Means    Sigmas
A-S          2.24     4.67
E-I          1.06     1.74
E            2.67     3.63











Some of these ratios indicate statistically significant changes due to 
repeated practice. 
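
The text does not say which formula was used for the sigma of the difference. As a hedged sketch, the conventional formula for the difference between two correlated means (the same children tested twice) can be applied to the A-S figures printed in Tables III and IV: trial 1 against trial 4, with the trial 1 and trial 4 intercorrelation of .62 and n = 100. Under that assumption the result comes out close to the first ratio printed in the Means column (2.24). The function name is illustrative only.

```python
from math import sqrt


def critical_ratio_correlated(mean1, sd1, mean2, sd2, r, n):
    """Ratio of a mean difference to its standard error for paired scores.

    Assumes the conventional formula
    se_diff = sqrt(se1^2 + se2^2 - 2 * r * se1 * se2),
    where se = sd / sqrt(n) and r is the correlation between the two trials.
    """
    se1 = sd1 / sqrt(n)
    se2 = sd2 / sqrt(n)
    se_diff = sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
    return (mean2 - mean1) / se_diff


# A-S subtest, trial 1 versus trial 4 (values as printed in the tables).
ratio = critical_ratio_correlated(16.28, 3.95, 17.23, 5.35, r=0.62, n=100)
```

The agreement for the A-S means suggests, but does not prove, that this is the formula the authors applied; the other printed ratios are not all reproduced as closely by the same inputs.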


STABILITY OF THE SCORES 


The correlation of each trial with every other trial shows how 
stable the scores are over a two-month interval. Table IV shows 
these correlations. 
In the A-S Test the intercorrelations of the later trials are higher
than those of the first with the later trials. This tendency is also
present, but not so marked, in the other two tests. Tests closer
together in time seem slightly more similar than those further apart.
In general the correlations are fairly high, considering the reliabilities
of the tests. On repeated trials of the tests over a two-month interval
children tend to mark the tests in the same way. To the extent that
we are measuring personality characteristics, we may say that such
characteristics do not fluctuate from week to week. Children tend
to be consistent in their responses. We may have some confidence,
therefore, that what we are measuring is not some unimportant trait
that fluctuates from day to day, but rather something more basic and
stable in the personality make-up of the child.


TABLE IV.—INTERCORRELATIONS BETWEEN THE FOUR TRIALS OF THE TEST

Trials       A-S     E-I      E
1 and 2      .65     .70     .79
1 and 3      .61     .68     .67
1 and 4      .62     .63     .65
2 and 3      .80     .72     .77
2 and 4      .76     .72     .79
3 and 4      .83     .76     .66














CONSISTENCY OF RESPONSE 


Children differ very greatly with reference to the consistency with
which they mark the items of the four trials. A consistency of response
score was calculated for each child. This consisted of the number of
items marked on the second, third and fourth administrations in the
same way as the child had marked them on the first administration of
the test. For the three subtests, we have the following results:


                           A-S     E-I      E
Mean Consistency Score    21.6    22.4    24.5
Sigma                      4.6     4.3     6.0
Range                     7-30    9-30    6-34














This means that we find children on the A-S Test who check from seven 
to thirty items in exactly the same manner on all four trials. The 
average child checks about twenty-two out of the thirty-five items 
consistently on all four trials in this test. Comparing tests we note
that the A-S and E-I tests have very similar means and sigmas, 
whereas the E test has a higher mean but a larger sigma and a wider 
range. This analysis of consistency of response is connected to some 
extent with the relatively high retest reliabilities that we have already 
reported. 
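
The consistency of response score defined above can be sketched in a few lines. This is an illustrative reading of the definition, assuming each administration is stored as a list of marks ("S" for same, "D" for different) in item order; the data layout and function name are hypothetical.

```python
def consistency_score(trials):
    """Count items marked identically on every administration.

    trials: list of administrations for one child; each administration is
    a list of marks in item order. The first administration is the
    reference against which the later ones are compared.
    """
    first, *later = trials
    return sum(
        1
        for i, mark in enumerate(first)
        if all(trial[i] == mark for trial in later)
    )


# Made-up example: one child, four administrations of a four-item subtest.
child = [
    ["S", "D", "S", "S"],  # first administration
    ["S", "D", "D", "S"],  # second
    ["S", "D", "S", "S"],  # third
    ["S", "S", "S", "S"],  # fourth
]
score = consistency_score(child)  # items 1 and 4 never change
```

For a thirty-five-item subtest the score therefore ranges from 0 to 35, which is the scale on which the means and ranges above are expressed.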

We have enquired further as to whether the consistency of response 
is related in any way to the personality characteristics measured by the 
tests themselves. Is an ascendant or submissive child more consistent 
in his responses? Is an emotionally stable or unstable child more 
consistent and so on? The correlations between consistency of 
response and scores on the tests are given below. The scores used 
here are the average scores for the four trials. 


Consistency of response on A-S and average score on four trials of A-S test = −.03
                                                                          PE =  .067
Consistency of response on E-I and average score on four trials of E-I test =  .09
                                                                          PE =  .067
Consistency of response on E and average score on four trials of E test     =  .36
                                                                          PE =  .059
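
The probable errors listed above agree with the usual formula for the probable error of a correlation coefficient, PE = .6745 (1 − r²) / √n, taking n = 100. The short sketch below checks that agreement; the choice of formula is an inference from the printed values rather than something stated in the text, and the function name is illustrative.

```python
from math import sqrt


def probable_error_r(r, n):
    """Probable error of a product-moment correlation (classical formula)."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)


# The three correlations reported for n = 100 children.
pe_as = round(probable_error_r(-0.03, 100), 3)
pe_ei = round(probable_error_r(0.09, 100), 3)
pe_e = round(probable_error_r(0.36, 100), 3)
```

On this reading the coefficient of .36 is about six times its probable error, while the other two are smaller than or barely larger than theirs.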


Evidently ascendant-submissive responses or extrovert-introvert
responses are not differentiated by greater or less degrees of consistency.
The submissive child is just as likely to be consistent (or inconsistent)
as is the ascendant child; and likewise for the introvert and extrovert.
The correlation of .36 for the E test would seem to indicate a slight
tendency for the more emotionally stable child to be slightly more
consistent in his responses on the E test than the less emotionally
stable child. The correlation is not high, but it is high enough to
suggest that a tendency in this direction exists. This would seem to
agree with our concept of emotional instability as indicating the child
who is less stable and, therefore, more flighty and erratic in his behavior.


THE CONSISTENCY OF THE ITEMS 


In addition to the consistency of the individual child discussed in 
the last paragraph, we have studied the consistency of the individual 
item. For each of the items on the inventory we have calculated the 
percentage of children that did not change their response to that item 
during the four trials of the test. For example, item number
twenty-seven on the E test showed eighty-nine per cent of the children
making no change from trial 1 to trial 4, that is, their response to
this item was the same for four trials. This is the highest percentage
of consistency, as above defined, for any item on the test. The lowest
percentage is thirty-two per cent, for item number seven on the A-S
Test.

We give below the three most consistent and the three least
consistent items for the three subtests:
































                                                           Consistency   Item
Item                                                        percentage    No.

A-S Test
I think that friends who don’t agree with me are stupid.        86         21
[item text illegible]                                           84         30
I try to be the first to get on a street car.                   78          2
I do not like to be the leader in games.                        38         23
[item text illegible]                                           36         29
I find it hard to talk before other children.                   32          7

E-I Test
I would rather play with other children than play alone.        84          4
[item text illegible]                                           81         28
[item text illegible]                                           80          5
I like to go camping rather than read about it.                 80         22
I like to go around classes, collecting money for the Red
  [remainder illegible]                                         80         35
I want to work alone because I don’t want other people
  [remainder illegible]                                         44         25
I find it easy to start speaking to a new pupil.                40         10
[item text illegible]                                           36         20

E Test
I am very much afraid of water.                                 89         27
I like to tease my friends until they cry.                      82         36
Everything gets on my nerves.                                   81         33
I am always afraid that sad things will happen to me.           45         15
I find it hard to forget my troubles.                           44          7
I wish to do the right thing but sometimes I can’t get
  [remainder illegible]                                         41         28











A study of these items, as well as the rest not printed here, seems
to show no obvious differences between the most consistent and the
least consistent items. We have examined such factors as
word-difficulty and length of sentence, but they show no relation to
percentage of consistency. Other factors, such as the definiteness as
opposed to the vagueness of the situation described, and the probable
frequency of occurrence in children’s lives, have also been examined in
relation to consistency, but none of them shows a marked relationship.


SUMMARY

The Aspects of Personality Inventory was given four times at two- 
week intervals to one hundred children. The scores made by these 
children show a slight tendency to rise due to practice effect. The 
stability of the scores is fairly high. Evidently the test is measuring 
something fairly stable in the personality make-up, at least over a 
period of six weeks. The items have been studied with reference to 
consistency over this six-week period and have been found to vary 
greatly among themselves, the highest having a consistency percentage 
of eighty-six and the lowest of thirty-two. No clear differentiation 
between the most consistent and the least consistent items has so far 
emerged. 
COMPARISON OF METHODS OF STUDY
FOR IMMEDIATE AND DELAYED RECALL 


C. O. MATHEWS 
Ohio Wesleyan University 


In a previous article by the writer and Miss Nora Toepfer¹ a
comparison was made between those principles of effective study listed
in widely used books on study procedure and those used by good 
and poor students in the high school. In that study it was shown 
that note-taking of various kinds, outlining, and underlining are 
frequently recommended procedures for efficient study. There is a 
dearth of experimental literature, however, dealing with most of the 
principles recommended by the authors of these widely used guides. 
It seems wise to extend experimentation along these lines until more 
valid guideposts for effective study procedures are established. 


THE PROBLEM 


To what extent does the underlining of important concepts and 
ideas, as one reads, aid him to comprehend and to see the relation- 
ships among these ideas? Is this a more or less effective study pro- 
cedure than taking outline notes or spending one’s whole time reading 
without taking notes at all? It was for the purpose of investigating 
these problems that this study was conducted. 


SELECTION OF MATERIALS 


A passage of about two thousand words describing the schools of
Ancient Greece was selected as suitable study material. This
passage was especially written for high-school pupils by Charles A.
Beard and W. G. Carr.² It contained elements of interest, a vocabulary
which seemed suitable, and an underlying outline showing
careful organization. Also, the ideas portrayed were quite condensed
in a manner similar to high-school textbooks.

Mimeographed copies of this article were prepared and on top of
each was attached a set of directions to the student. Three forms of
directions were devised so that each third of the subjects used a
different method of study. The factors which characterized the three
methods are shown in the following quotations from these directions.


¹ Mathews, C. O. and Toepfer, Nora: “Comparison of Principles and
Practices of Study,” School Review, Vol. XLIV, March, 1936, pp. 184-192.

² Beard, Charles A. and Carr, William G.: “Schools of Greece and Rome.”
The Journal of the National Education Association, Vol. XXIII, December, 1934,
pp. 230-232.


Form A.—Do not use a pencil while you study. Do not take notes of
any kind and do not mark on this reading material. Try to remember what
you read. (Some of your neighbors are being asked to use their pencils for
various kinds of notes.)


Form B.—As you read, underline the most important ideas. Be careful 
not to underline too much. Although there is usually only one important 
idea in a paragraph, you may find some paragraphs with two or three impor- 
tant points. If you desire to make any notes on the margins of the sheets 
to indicate which are the main ideas and which the less important ones you 
may do so. 


Form C.—Your teacher will supply you with a sheet of paper on which 
you may make notes while you read. (Be sure your name is on the sheet of 
notes.) Make your notes brief, including only the more important ideas. 
Look for the relationship of ideas and make your notes in outline form. Do 
not spend much time writing or you will not have time to read all the material. 
Do not write or mark on these mimeographed sheets. 


Each teacher who coöperated by giving these materials arranged
the forms in rotation order before passing them out to the pupils.
Thus, one-third of the children within each room studied each form,
the forms rotating in each row as the children happened to be seated.
It was thought that this procedure would insure samplings of subjects
of about equal ability.
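
The rotation scheme described above amounts to dealing the three forms out in a fixed cycle along the seating order. A minimal sketch follows; the pupil labels and the function name are hypothetical, and the sketch treats the room as a single seating sequence, which simplifies the row-by-row distribution the teachers actually carried out.

```python
from itertools import cycle


def assign_forms(seating_order, forms=("A", "B", "C")):
    """Hand out forms A, B, C in rotation along the seating order."""
    return dict(zip(seating_order, cycle(forms)))


# Made-up room of five pupils, listed in the order they happen to be seated.
room = ["pupil_1", "pupil_2", "pupil_3", "pupil_4", "pupil_5"]
assignment = assign_forms(room)
```

Because the form a pupil receives depends only on seat position, the three groups should be of roughly equal size and, if seating is unrelated to ability, of roughly equal ability, which is the assumption the writer states.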

The pupils were instructed to read the text just as they would an
assignment in a book, trying to understand it so that they could tell
the main ideas or pass a test on it. Exactly twenty-five minutes were
allowed for the study period. The reading materials then were
collected and tests were given to measure the comprehension and
knowledge of facts and the relationships between the important ideas
involved.


THE TEST 


All pupils were given the same test, for which they were allowed 
exactly fifteen minutes. The test was divided into two major parts. 
The first part consisted of twelve multiple choice items devised in 
such a way as to give a measure of understanding and memory of 
factual details. The second part was in the form of an outline scheme 
whereby the student could show his understanding of the organization 
of the materials by arranging suggested points in order of an adequate 
relationship. A perfect score on the entire test was twenty-four points,
the maximum score on each part being twelve. The reliability
coefficient of the test, determined by correlating the odd and even items and
using the Spearman-Brown prophecy formula, was found to be .85 on
the basis of one hundred sixty-four test records of ninth-grade pupils.


SELECTION OF PUPILS 


Seven hundred thirty-five pupils took both the test for immediate 
recall and the one for delayed recall (a repetition of the same test 
at the end of one month). These pupils represented ten different 
high schools. They were distributed as shown in Table I in respect 
to the form of directions for study and grade levels. 


TABLE I.—DISTRIBUTION OF PUPILS ACCORDING TO GRADE LEVELS AND FORMS
OF STUDY DIRECTIONS

                   Form of study directions¹
Grades               A       B       C      Total
IX and X            139     136     130      405
XI and XII          112     109     109      330
Total               251     245     239      735

¹ Form A. Read only; make no notes.
  Form B. Read, underline, make marginal notes.
  Form C. Read, make outline notes.


RESULTS 


The results for each form of study directions were tabulated
separately for Grades IX and X, XI and XII, and IX to XII, inclusive.
The means and standard deviations of the distributions for
these various categories were computed and are given in Table II.
In this table “Facts” refers to the first part of the test dealing
with memory for detail facts; “Outline” refers to the second portion of
the test measuring memory for and ability to organize the chief points
of the passage into the outline followed by its authors. Roman
numerals in the third column refer to the time at which the test was
given; I, immediately after the passage was read, and II, one month
later. These are measures, respectively, of immediate and delayed
recall.

The relative effectiveness of the three forms of study directions
for immediate recall may be seen by comparing the means in this table
for those rows marked I. The differences between these means and
the standard errors of these differences are shown in Table III. A
plus difference in this table is one which favors the first member of the
pair of forms involved and a minus difference is one which favors the
second member.


TABLE II.—THE MEANS AND STANDARD DEVIATIONS OF THE DISTRIBUTIONS OF
SCORES ON IMMEDIATE AND DELAYED RECALL TESTS FOR EACH FORM OF
STUDY DIRECTIONS

                                               Form of study directions
             Portion   Immediate (I)         A              B              C
Grades       of test   or delayed (II)   Mean    SD     Mean    SD     Mean    SD

IX and X     Facts          I             9.85  1.85     9.40  1.91     9.45  2.10
                            II            8.83  2.18     8.82  2.14     8.85  2.18
             Outline        I             5.41  2.68     4.49  4.07     5.28  3.99
                            II            4.53  3.78     4.00  3.42     4.72  3.73
             Total          I            15.26  5.42    13.89  5.08    14.74  4.90
                            II           13.34  5.20    12.82  4.87    13.57  4.97

XI and XII   Facts          I            10.79  1.71    10.69  1.44    10.61  1.83
                            II           10.19  1.70     9.92  1.91     9.88  2.07
             Outline        I             7.77  4.17     6.83  4.30     7.67  4.04
                            II            6.54  4.05     5.67  3.85     6.28  3.92
             Total          I            18.54  5.26    17.50  5.15    18.29  5.19
                            II           16.73  5.14    15.59  4.79    16.16  5.27

All grades   Facts          I            10.27  1.85     9.98  1.83     9.98  1.99
                            II            9.43  2.10     9.31  2.11     9.32  2.19
             Outline        I             6.46  4.32     5.53  4.33     6.37  4.19
                            II            5.42  4.03     4.74  3.71     5.43  3.90
             Total          I            16.72  5.59    15.50  5.41    16.36  5.34
                            II           14.85  5.44    14.05  5.03    14.75  5.27





























It will be noticed that none of these differences is statistically 
reliable. However, Form A of the study directions (reading only) 
consistently produced the largest scores both for factual memory and 
ability to discover and reproduce the outline of the passage. Form C 
(reading and outlining) seemed to be more effective than Form B 
(reading, underlining and making marginal notes) in aiding the indi- 
vidual to grasp the relationship of ideas as measured by the outline 
test. There was no difference in the effectiveness of these two methods 
of study so far as memory for facts was concerned. 
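
To make the reliability judgment concrete, the sketch below recomputes one entry of Table III from the figures in Tables I and II: the all-grades total score on the immediate test, Form A against Form B. It assumes the usual standard error of a difference between independent means, sqrt(sd1²/n1 + sd2²/n2), which the article does not state explicitly; the function name is illustrative.

```python
from math import sqrt


def se_of_difference(sd1, n1, sd2, n2):
    """Standard error of the difference between two independent means."""
    return sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


# All grades, total score, immediate recall: Form A (n = 251) vs Form B (n = 245).
diff = 16.72 - 15.50
se = se_of_difference(5.59, 251, 5.41, 245)
ratio = diff / se  # difference expressed in units of its standard error
```

The difference of 1.22 with a standard error of about .49 matches the printed entry, giving a ratio of roughly 2.5. That falls short of the ratio of about 3 that was commonly required at the time, which is presumably (an assumption, not stated in the text) why the writer treats even the largest differences as not statistically reliable.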

TABLE III.—DIFFERENCES OF MEANS AND STANDARD ERRORS OF DIFFERENCES
FOR VARIOUS FORMS OF STUDY DIRECTIONS ON IMMEDIATE RECALL TEST

                   Portion            Form of study directions
Grades             of test         A-B            A-C            B-C

IX and X           Facts         .45 ± .27      .40 ± .28     −.05 ± .25
                   Outline       .92 ± .42      .13 ± .42     −.79 ± .49
                   Total        1.37 ± .63      .52 ± .63     −.85 ± .61

XI and XII         Facts         .10 ± .21      .18 ± .24      .08 ± .22
                   Outline       .94 ± .57      .10 ± .55     −.84 ± .57
                   Total        1.04 ± .70      .25 ± .70     −.79 ± .70

IX, X, XI, XII     Facts         .29 ± .17      .29 ± .17      .00 ± .17
                   Outline       .93 ± .39      .09 ± .38     −.82 ± .39
                   Total        1.22 ± .49      .36 ± .49     −.86 ± .49


The relative effectiveness of the three methods of study for delayed
recall (measured by giving the same test to these children again
without previous notice after a period of one month) may be seen in
Table IV. This table contains the differences of the means in the rows
marked II of Table II, and the standard errors of these differences.


TABLE IV.—DIFFERENCES OF MEANS AND STANDARD ERRORS OF DIFFERENCES
FOR VARIOUS FORMS OF STUDY DIRECTIONS ON DELAYED RECALL TEST

                   Portion            Form of study directions
Grades             of test         A-B            A-C            B-C

IX and X           Facts         .01 ± .26     −.02 ± .27     −.03 ± .27
                   Outline       .53 ± .43     −.19 ± .46     −.72 ± .44
                   Total         .52 ± .61     −.23 ± .62     −.75 ± .60

XI and XII         Facts         .27 ± .24      .31 ± .26      .04 ± .27
                   Outline       .87 ± .53      .26 ± .54     −.61 ± .53
                   Total        1.14 ± .67      .57 ± .70     −.57 ± .68

IX, X, XI, XII     Facts         .12 ± .19      .11 ± .19     −.01 ± .20
                   Outline       .68 ± .35     −.01 ± .36     −.69 ± .35
                   Total         .80 ± .47      .10 ± .48     −.70 ± .47

















It will be noticed that one month after the materials were studied
those who used method A still surpassed those who used method B,
both in memory for facts and in outlining. Method C seemed to
gain in effectiveness over method A for the ninth- and tenth-grade
pupils, but method A showed larger differences over C for eleventh-
and twelfth-grade pupils than it did on the immediate recall test.
Method C surpassed B in terms of the student’s efficiency to reproduce
the outline of the passage, but not in producing memory for factual
details.

It is recognized that these differences are not entirely reliable from
a statistical point of view and that arguments based upon what seem
to be trends and indications of consistency may be in error. However,
in summary, it appears that when time was kept constant merely
reading the materials with intent to understand and remember was 
more effective than underlining and making marginal notes or out- 
lining for immediate recall, and more effective than underlining and 
making marginal notes for delayed recall. Outlining the material 
studied was more effective for delayed recall at the ninth- and tenth- 
grade level, but merely reading the material was more effective at 
the eleventh- and twelfth-grade level. Of course, since time was kept 
constant those who did not make notes or outline the materials had 
more time to read or repeat the reading of difficult passages. 

Broad generalizations in regard to the relative effectiveness of 
these study procedures are hardly justified from the data given. The 
tests somewhat controlled the pupils’ responses and thus they may 
not be the best tests possible of memory for facts and of ability to 
recognize and reproduce the outline of a passage. An examination 
of the means (and the original distributions) leads the writer to believe 
that the factual test was rather easy for eleventh- and twelfth-grade 
pupils and that the outline test was rather difficult for ninth- and 
tenth-grade pupils. If these pupils had been taught to outline or to 
underline effectively before the investigation, the results might have 
been different. If the schools concerned could have justified more 
elaborate experimentation so that both the materials and the test 
could have been more extensive and complex, the procedures might 
have been shown to possess different relative values. However, such
investigations as this one make evident the need of caution in
accepting wholesale such statements as: “Your knowledge of the subjects
you are studying is not likely to be any better than the notebooks
which you construct” and “It is unfortunate that . . . the art of
making marginal notes and of underlining important sentences has
been frowned upon.” They furthermore emphasize the need for
comprehensive research in study methods under conditions similar to
those which prevail in schools.
AN EXPERIMENTAL STUDY OF THE EFFECT OF
NEUROTICISM UPON AGE-GRADE STATUS OF 
CHILDREN 


FRED BROWN 
Child Study Department, Minneapolis Public Schools 


A number of investigators in the field of child psychology and
education have commented upon the positive relationship between
neuroticism and age-grade status. Gates¹ believes that the unstable
child cannot indefinitely keep pace with those of greater poise but
otherwise equal endowment, and that the emotional child lags behind
in his intellectual possibilities. Hollingworth² agrees with Burt³ that
“. . . neurotic children are often deficient in reading, though they
may be intelligent.” She further assumes that neurotic children fail
in tasks which demand persistence, coöperation, deliberation, and
attention. Stress is laid upon the belief that “General nervous
instability naturally tends to failure in any school subject which demands the
qualities of character mentioned above as essential to the mastery of
reading.” It is her opinion that the relationship between stability and
intellect might be positive and high but not perfect. Thorndike⁴
expresses the belief that emotionality and intellectual efficiency are
antagonistic.

These statements, since they are unsupported by experimental 
evidence, must be regarded as value-judgments. As such they are 
interesting conjectures which serve to stimulate research but offer no 
solution to the problem. 

Two outstanding experimental attacks upon the problem have been
made in recent years. In 1925 Rosen⁵ paired fifty neurotic children
with fifty normals, and found the neurotic group to be decidedly below
average in age-grade status and below par in intelligence. In 1930
Keys and Whiteside¹ selected neurotic children upon the basis of
teacher ratings and scores on the Woodworth-Cady questionnaire.
These children were compared with normals for age, grade, sex, and
intelligence. They conclude: “Children in grades VI to VIII who
were characterized by their teachers and by their own scores on the
Woodworth-Cady questionnaire as conspicuously nervous and
emotional display a strong and reliable tendency to average more than a
year retarded in age-grade standing, nearly two years lower in mental
and educational age and 18 points lower in IQ as compared with their
more stable mates.”


¹ Gates, A. I.: Psychology for Students of Education, 1923, Chapters VIII
and IX.

² Hollingworth, L. S.: Special Talents and Defects. The Macmillan
Company, 1923, pp. 69-70.

³ Burt, Cyril: “The unstable child.” Child Study Magazine, Vol. X, Oct.,
1917, pp. 61-79.

⁴ Thorndike, E. L.: Educational Psychology, Vol. III, 1913, p. 363.

⁵ Rosen, E. A.: A Comparison of the Interests and Educational Status of Neurotic
and Normal Children in Public Schools. Teachers College, Columbia University,
Contributions to Education, No. 188, 1925.

An evaluation and criticism of these studies must be preceded by a
consideration of the causality factor present in all relational
investigations. Is the neurotic child lacking in those traits which conduce to
academic success or do these data indicate that children of low
intelligence develop neurotic traits because of their inability to cope with the
intricacies of a complex environment? If we disregard the vague
connotations of the term “neurotic” and assume that the afflicted child
is making affective trial-and-error responses to an inadequately
discriminated obstacle in the social environment² it becomes obvious
that low intelligence would tend to increase the number of such
obstacles while it would concomitantly interfere with the discovery of
adjustive responses. It might also be argued that the child of superior
intelligence would tend to discriminate more problem situations in the
environment because of his greater needs and keener insight into the
value of satisfying these needs despite the repressive attitudes of
adults. The discovery of a positive relationship between neuroticism
and retardation would, therefore, offer no evidence that neuroticism
as such predisposes to age-grade lag. Another important factor to be
considered in such studies involves the question of socio-economic
status. Studies by Brown³ and Brill⁴ have shown that neglect of this
important variable may lead to serious distortion of results in all
studies in which groups are being compared for various traits.


¹ Keys, N., and Whiteside, C. H.: “The relation of nervous-emotional stability
to educational achievement.” Jour. of Educational Psychology, Vol. XXI, 1930,
pp. 429-441.

² Brown, F.: “The problem of nervousness and its objective verification in
children.” Jour. of Abn. and Soc. Psychol., Vol. XXXI, July-Sept., 1936, pp.
194-207.

³ Brown, F.: “A comparative study of the influence of race and locale upon
emotional stability of children.” Jour. of Genetic Psychol., Vol. XLIX, 1936,
pp. 325-342.

⁴ Brill, M.: “Studies of Jewish and non-Jewish intelligence.” Jour. of Ed.
Psychol., Vol. XXVIII, May, 1936, pp. 331-352.
In Rosen’s study the criterion of stability was based solely upon the vat 
diagnosis of a single school psychiatrist. Her cases were paired by age } 
and grade and then investigated for differences in emotional stability. f 
No cognizance was taken of differences in socio-economic status. It is | 
questionable, therefore, whether her results offer acceptable evidence 
in one direction or another. 
Several cogent criticisms may be directed against the findings of 
Keys and Whiteside. In the first place, teacher ratings are notoriously 
faulty. Fleming and Fleming¹ compared teacher ratings with scores
on the Mathews revision of the Woodworth questionnaire. Eighty-eight
ratings of girls in the Horace Mann School for Girls of Teachers
College were obtained in the seventh, eighth, ninth, and tenth grades.
Raters were carefully selected from the standpoint of familiarity with
the ratees. A correlation of -.141 was obtained between teachers’
estimates of emotional balance and the Mathews revision. Correlations
of -.139 and -.195 were obtained with age partialed out and
age added, respectively. If, therefore, this criterion is intended to
establish the validity of cases selected, the final results must be
interpreted with extreme caution. Secondly, the instrument utilized is
based upon the Woodworth PD sheet and the Johnson revision. The
twelve additional items bore directly upon the habits and general
behavior of incorrigibles. The authors mention eighty items in this
inventory, but no data with regard to the validity of these additional
items are offered. Thirdly, only thirty-two children in all were used in
the final comparison. This would appear to be too small a number for
far-reaching conclusions. Finally, no differentiation between the two
groups upon the basis of socio-economic status is mentioned. In the
light of these criticisms it is hardly possible to reach definite conclusions
with regard to the problem other than a conviction that it merits
further investigation.

In an earlier paper² the writer has described his personality inventory
for children. Its standardization upon sixteen hundred sixty-three

1 Fleming, E. G., and Fleming, C. W.: “The validity of the Mathews revision
of the Woodworth Personal Data questionnaire.” Jour. of Abn. and Soc. Psychol.,
Vol. XXIII, Jan.-Mar., 1929, pp. 500-506.

2 Brown, F.: “A psychoneurotic inventory for children between nine and
fourteen years of age.” Jour. of Appl. Psychol., Vol. XVIII, August, 1934,
pp. 566-577.

children between nine and fourteen years of age; the inclusion
of the socio-economic factor; and consistent odd-even reliabilities of 
between +.896 ± .007 and +.928 ± .014 would seem to make this
instrument a particularly valuable one in the study of neuroticism 
among children. It should also be pointed out that the validity of the 
instrument is based upon an exhaustive examination of the literature 
dealing with neuroticism in childhood, followed by rigid statistical 
validation of the items so obtained. 
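
The odd-even reliabilities mentioned above can be illustrated with a minimal sketch. The response matrix below is invented purely for illustration, and the Spearman-Brown step-up is an assumption: the text does not say whether the reported coefficients were corrected in this way.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(responses):
    """Correlate odd-item totals with even-item totals for each child.

    Returns the half-test correlation and the Spearman-Brown estimate
    for the full-length test (the step-up is assumed, not taken from the text).
    """
    odd_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical data: four children, four items scored 0 or 1.
responses = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
r_half, r_full = odd_even_reliability(responses)
print(round(r_half, 3), round(r_full, 3))
```

With a real inventory each row would hold one child's scored answers to all items, in the order in which the items were printed.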

In the present study forty-four neurotics and forty-four normals 
of both sexes were paired with each other. Cases were selected at 
random from a much larger number of school children to whom the 
inventory was administered. Socio-economic status¹ and racial strain
were kept constant. Since the mean score on the inventory for the 
entire original group (sixteen hundred sixty-three cases) was approxi- 
mately twenty points, we selected all cases below twenty-two points 
for the normal and all above thirty points for the neurotic group. The 
mental ages and IQ’s are based upon Morgan’s Mental Test which 
was found by Growden² to correlate +.814 ± .01 with the Stanford-Binet
test. This correlation was obtained from the results on five
hundred cases examined at the Ohio State Bureau of Juvenile Research. 
In 69.4 per cent of these cases the variation in mental ages between the 
two tests, which were administered at practically the same time, was 
not more than one year. Of these, 45.6 per cent varied under six 
months. These correlations were comparable with those obtained on 
two hundred ninety cases between first and second Stanford-Binet 
scores, and between second and third Stanford-Binet scores. The r in
the former instance was +.819 ± .013 and in the latter +.835 ± .012.
Table I shows our distribution of cases. No effort was made to differ- 
entiate between the sexes since our data on sex differences show that 
there are only fifty-eight chances in one hundred of a true difference 
between the means. 
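
The "chances in one hundred of a true difference" can be sketched as follows. The text does not state the formula, so two assumptions are made here: that the standard error of a difference is the root of the summed squares of the two σ(m) values, and that the chances are a one-tailed normal probability of the critical ratio. On that reading the figures in Table I are approximately reproduced.

```python
from math import erf, sqrt


def se_of_difference(se_mean_1, se_mean_2):
    """Standard error of the difference between two independent means."""
    return sqrt(se_mean_1 ** 2 + se_mean_2 ** 2)


def chances_in_100(critical_ratio):
    """One-tailed normal probability, in per cent, that the true difference
    has the same sign as the observed one (an assumed reading of the table)."""
    return 100 * 0.5 * (1 + erf(abs(critical_ratio) / sqrt(2)))


# Grade IV inventory scores in Table I: sigma(m) = 1.5 and 1.3, Diff. = 28.90
se_diff = se_of_difference(1.5, 1.3)
ratio = 28.90 / se_diff
print(round(se_diff, 3), round(ratio, 1), round(chances_in_100(ratio), 1))

# Critical ratios of .19 and .37, as printed in the table for Grade VIII
print(round(chances_in_100(0.19)), round(chances_in_100(0.37)))
```

A ratio near zero gives about fifty chances in one hundred, which is why values such as fifty-five or fifty-eight indicate no dependable difference.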

In Table I it can be seen that in every case the difference between 
the normal and the neurotic group is practically certain with respect 
to neuroticism. It is only in the fourth grade that some difference in 
mental age and IQ favors the normal group. It should be noted, 
however, that there is no difference in age-grade status! This absence 
of a reliable difference in intelligence or chronological age between the 
two groups in the sixth and eighth grades would seem to contradict the 





1 The Sims Socio-Economic Score Card was utilized for this purpose. 
2 Growden, C. H.: Unpublished data.
findings of previous investigators.¹ It is also interesting to note that
no differences in intelligence would probably have been found had we 
used the same grades as those of Keys and Whiteside. 


DISCUSSION OF RESULTS 


In order to understand more completely the reason for the dis- 
agreement between our results and those of previous investigators it is 
necessary to clarify somewhat the concept of neuroticism. As we 
have mentioned above, the teacher is likely to overlook the difference 


TABLE I.—COMPARISON OF AGE-GRADE STATUS AND INTELLIGENCE BETWEEN
NEUROTIC AND NORMAL SCHOOL CHILDREN: GRADES IV, VI, VIII
(N = 88; Forty-four Neurotics Paired with Forty-four Normals, Both Sexes)

                     Neurotic                                    Chances in           Normal
                                        Diff.             Diff./  100 of true
               Mean      σ     σ(m)    M¹-M²   σ(Diff.)  σ(Diff.) difference    Mean      σ     σ(m)

Grade IV
  Inventory    40.35    6.88    1.5    28.90     1.984    14.5      100+        11.45    5.93    1.3
  C.A.        116.30    6.08    1.3      .25     1.770      .14      55        116.05    5.81    1.2
  M.A.        112.60   16.4     3.6    14.50     5.307     2.73      99.6      127.10   17.6     3.9
  I.Q.         98.50   15.7     3.5    11.55     5.315     2.17      98        110.05   18.2     4.0

Grade VI
  Inventory    41.76    7.45    2.0    29.15     2.332    12.50     100+        12.61    4.41    1.2
  C.A.        139.23    6.38    1.7     2.54     2.267     1.12      86        136.69    5.75    1.5
  M.A.        149.84   30.86    8.5     1.84     9.571      .19      58        148.00   15.94    4.4
  I.Q.        108.23   24.67    6.8     1.08     7.839      .13      55        107.15   14.18    3.9

Grade VIII
  Inventory    37.63    6.16    1.8    23.45     2.617     8.96     100+        14.18    6.39    1.9
  C.A.        163.36    7.79    2.3     3.55     2.801     1.20      88        159.81    5.56    1.6
  M.A.        175.09   34.05   10.2     2.18    11.230      .19      58        177.27   15.65    4.7
  I.Q.        107.81   23.90    7.2     2.91     7.689      .37      64        110.72    9.01    2.7



































between a hyperactive child whose inferior mentality may cause him 
to be bored by the demands of the classroom and the truly neurotic 
child who offers no problem because of his withdrawal from the social 
phases of classroom interaction. Wickman² has shown that teachers
rate seclusiveness and withdrawal in the school child as relatively 





1 All children were examined toward the end of the school year. This would
necessitate adding approximately eight months to all chronological ages in order
to raise them to the average age for each grade.

2 Wickman, E. K.: Children’s Behavior and Teachers’ Attitudes. New York,
Commonwealth Fund, 1928.
inconsequential while sex problems and aggressive actions toward
persons and property are rated as very serious. To the school teacher 
who is rating children for neuroticism motor activity would seem to be 
a criterion of the condition. 

In consulting our data we find that the items as scored by our 
major group tend to substantiate the concept of neuroticism held by 
mental hygienists. By testing the validity of items through the use 
of the Upper Versus Lower (UL) and Overlapping (OL) methods¹ it
was possible to eliminate those items which showed more than twenty
per cent overlapping between the two criterion groups.² These items
include physical restlessness (OL twenty-nine per cent), left-handedness 
(OL thirty-two per cent), nail biting (OL twenty-one per cent), and 
pencil chewing (OL twenty-three per cent). Those items which are 
really indicative of neuroticism and show a low percentage of over- 
lapping are: Difficulty in speaking when called upon by the teacher 
(OL seventeen per cent; UL 235), difficulty in paying attention (OL 
fifteen per cent), and feeling that grades are lower than they ought 
to be (OL twenty per cent). This last response may be caused by 
the tendency on the part of teachers to underestimate the intelligence 
and scholastic ability of the withdrawing child. 








TABLE II

Item                                                       UL      OL,
                                                                 per cent

Frequent loss of temper by father ....................    137       10
[item label illegible] ...............................     96       18
[item label illegible] ...............................    132       10
[item label illegible] ...............................    111       10
Parental impatience and irritability .................    106       12
[item label illegible] ...............................    200       17
[item label illegible] ...............................    168       18











Another factor to be faced in dealing with this entire problem 
involves the question of specificity. Does the neuroticism of the 





1 Lentz, T. F. Jr., Hirshstein, B., and Finch, J. H.: “Evaluation of methods of
evaluating test items.” Jour. of Ed. Psychol., Vol. XXIII, 1932, pp. 344-350.

2 A complete description of these methods and their application to the problem 
of item selection in the study of neuroticism among children may be found in: 
Brown, F.: An Experimental Study of the Psychoneurotic Syndrome in Childhood. 
Ohio State University Doctoral Dissertation, June, 1933, pp. lx + 165. 
maladjusted child always manifest itself in the classroom? Is
neuroticism a specific or general condition? Is it possible that
maladjustment in the home might cause a child to score poorly on a
psychoneurotic inventory when the instrument is being administered 
in the classroom, while at the same time no neurotic behavior can be 
observed in that situation? Our data show that responses to the 
items relating to school behavior may be fairly good while those 
pertaining to the home may be quite bad. The UL and OL figures
for home items in Table II indicate that the neurotic and normal
children differ sharply in their responses to these items. It is not 
unlikely that the neurotic child may at times find the school a welcome 
escape from the oppressive environment of the home. It has appar- 
ently been assumed in other studies that neuroticism is a general 
trait. It would probably be nearer the truth to regard it as more 
specific than general in its manifestations. If so, there is no reason 
for assuming that the neurotic child must necessarily be retarded in 
age-grade status because of his emotional problems. 
ATTITUDES OF COLLEGE STUDENTS AND THE
CHANGES IN SUCH ATTITUDES DURING 
FOUR YEARS IN COLLEGE, PART II 


VERNON JONES 
Clark University 
(Continued from January Journal.) 


II. THE RELATION OF INTELLIGENCE, AGE, MAJOR SUBJECT, 
POLITICAL PARTY, AND RELIGIOUS AFFILIATION TO ATTITUDES 


Having shown in Section I that there was a small but reliable change 
in attitudes toward war, religion, and the church between the freshman 
and senior years in college, we now turn to the study of the relation of 
certain other factors to attitudes. The first factor which we shall 
consider is scholastic aptitude or intelligence as measured by the 
American Council Psychological Examination. 

Intelligence.—To what extent is liberalism or conservatism in atti- 
tude toward war and religion related to general intelligence? Is there 
any tendency for the brighter students to be more liberal or vice versa? 
Twenty-five correlations were computed in different groups and sub- 
groups in order to get data on this question. The results are given in 
Table IV. In this analysis we have not been content to depend 
exclusively upon r’s based on total groups. It can be seen that if one 
religious sect drew (by family allegiance or emotional appeal) the less 
intelligent on the average and tended to give very conservative train- 
ing, and if another sect typically drew the more intelligent and gave 
very liberal training, this condition would spuriously “create”
correlation within the composite group, even though there was little or no
correlation between intelligence and attitude within one sect which 
was more or less homogeneous in religious training. Whether or 
not the correlations were higher in some sects than in others will be 
revealed by the analysis in Table IV. 
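
The argument about spurious correlation in a composite group can be made concrete with a small sketch. The two "sects" and their scores below are invented solely for illustration: within each group intelligence and attitude are uncorrelated, yet pooling the groups yields a substantial positive r because the group averages differ on both variables.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical sect A: lower intelligence scores, more conservative attitudes.
intelligence_a = [1, 1, 3, 3]
attitude_a = [1, 3, 1, 3]

# Hypothetical sect B: higher intelligence scores, more liberal attitudes.
intelligence_b = [6, 6, 8, 8]
attitude_b = [6, 8, 6, 8]

r_within_a = pearson_r(intelligence_a, attitude_a)
r_within_b = pearson_r(intelligence_b, attitude_b)
r_pooled = pearson_r(intelligence_a + intelligence_b, attitude_a + attitude_b)

print(r_within_a, r_within_b, round(r_pooled, 2))
```

This is why the r's are reported separately for each religious group as well as for the total group.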

From the bottom row of the table it will be seen that the r based 
on the total freshman group is +.06 for intelligence vs. war, and about
+.25 on the average for intelligence vs. religion, the separate r’s being
+.28, +.20, and +.23. These r’s are all positive, indicating within the
range of their reliability a slight tendency for high intelligence and 


liberal attitudes to go together. But we should hasten to add that the 
r’s are too low to justify any claim to very significant relationships.²
In the senior class the r between intelligence and war is +.21 and that
between intelligence and religion is +.11. These, like the freshman r’s,
are chiefly characterized by their smallness.


TABLE IV.—CORRELATIONS BETWEEN INTELLIGENCE AND STATUS SCORES
ON CERTAIN ATTITUDES¹

                     Intelligence        Intelligence vs.    Intelligence   Intelligence
                       vs. war          religion (reality)   vs. religion   vs. church,      No. of cases
                                                             (influence),   freshmen
                   Freshmen  Seniors    Freshmen  Seniors    freshmen                     Freshmen  Seniors

Protestant            .00     +.23        +.29     +.18        +.31          +.[?]           152       46
Catholic             -.17     -.37        +.05     +.15        +.11          +.11             63       19
[illegible]          +.18     -.06        +.15     -.01        +.19          +.16             53       13
[illegible]           ..       ..          ..       ..          ..            ..              ..       22
Total group          +.06     +.21        +.28     +.11        +.20          +.23            268      100





























1 The correlation between intelligence and attitude toward the Negro was not computed, but in 
view of the average attitude for each quintile of intelligence we estimate that it would have been 
positive, but less than +.15. The average attitudes for the various quintiles of intelligence from
highest to lowest are: First, 7.61; second, 7.54; third, 7.16; fourth, 7.36; and fifth, 6.97.

When we turn from the r’s based on totals to those based on the
separate religious groups we see the lowness of the relationship
between intelligence and these attitudes even more forcefully
presented. In attitude toward war the r’s range from -.37 to +.23
with a median of -.03. In attitudes toward religion and the church
the r’s range from -.01 to +.31, the median being +.15. Thus we
see that these r’s are extremely low on the average and variable from 
subgroup to subgroup. In the Catholic group, for example, the r 
between intelligence and war is negative in the case of both freshmen 
and seniors. The r’s in this group for intelligence vs. religion hover 
around +.11. The largest correlations for any subgroup are those for 
the Protestants. The differences between these r’s of the various sub- 
groups we attribute largely to the differences in the training received 





2 How low a correlation of .25 really is is shown by the fact that if one attempted
to predict from it the standing of students on an attitude scale from knowledge
of their intelligence scores, the error of the predictions would be ninety-seven per
cent as large as the error of predictions based on zero correlation. This is based on the
interpretation of r in terms of the coefficient of alienation. The coefficient of
alienation in this case is .968. See Kelley, T. L., Statistical Method. Pp. 173-174.
by these groups, particularly in home and church. Such differences in
training, we believe, can largely if not completely offset any differences 
in liberalism that might otherwise attend greater intelligence. 
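
The footnote's remark that a correlation of .25 leaves the error of prediction ninety-seven per cent as large as with zero correlation follows from the coefficient of alienation, k = sqrt(1 - r²). A brief sketch of that arithmetic (the function name is ours, chosen for illustration):

```python
from math import sqrt


def coefficient_of_alienation(r):
    """Ratio of the standard error of estimate to the standard deviation
    of the predicted variable: k = sqrt(1 - r**2)."""
    return sqrt(1 - r ** 2)


k = coefficient_of_alienation(0.25)
print(round(k, 3))        # coefficient of alienation for r = .25
print(round(100 * k))     # error of prediction as per cent of the zero-correlation error
```

Even an r of .50 would leave the error of prediction about eighty-seven per cent as large as a pure guess from the group average.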

In summarizing the results in Table IV, we may say that the general 
lowness of the r’s coupled with the striking inequality of the r’s from 
subgroup to subgroup stress the absolute futility of attempting to 
predict the conservatism or liberalism of students on the basis of intelli- 
gence scores alone. There is a low positive correlation on the average 
between intelligence and liberalism. But at least in the attitudes here 
measured it cannot be said, as the unanalyzed r’s of a few experimenters
have apparently led some to believe, that the duller students, regardless 
of training, are notably the more conservative and the brighter neces- 
sarily the more liberal. 

Age.—The next variable studied in relationship to the attitudes 
was age. Here an attempt was made to get some evidence on the 
question as to whether or not part of the change made during attend- 
ance at college was due to mere maturation and increase in experience 
which would naturally result irrespective of schooling. It was impossi- 
ble to get groups satisfactorily equated in all relevant variables to 
study this problem satisfactorily, but a little evidence was gotten by 
comparing two groups of students of a given class who differed from 
each other in age by one year, and then comparing this difference in 
turn with that between students of two different classes which differed 
by this same amount. It was found, in comparing twenty-eight young
freshmen with twenty-eight others who were older by one year, that 
the older group was not more liberal, as might have been expected from 
the fact that seniors were more liberal than freshmen, but slightly 
more conservative in every attitude studied. The differences were as 
follows: for War, -.47; Influence on Conduct, -.64; Reality, -.29;
and Church, -1.31. In comparing a group of forty-three freshmen,
corresponding in age with the above-mentioned freshmen, with
forty-three sophomores one year older, the latter were more liberal
by a slight amount on every attitude. The differences were as follows:
for War, +.19; Influence on Conduct, +.62; Reality, +.37; and Church, +.34.

Taking these results at their face value it seems that, though 
increase in school advancement—coupled of course with increase in 
age—leads to slightly greater liberalism, there is no reason to believe 
that increase in chronological age alone leads toward liberalism. 
There is one consideration, however, which suggests that the differences 
between the younger and the older freshmen cannot be taken at quite 
their face value, namely, the fact that the younger and older groups 
in the freshman class differ in intelligence, and if intelligence is
significantly related in a positive way with liberalism in these attitudes,
then these negative differences may be wiped out or even converted 
into positive differences. It is to be regretted that it has been impos- 
sible to control this variable satisfactorily. However, it has been 
shown in the last section that the relationship between intelligence 
and standing on these attitude scales is low, and our estimate is that 
the small positive r’s found are just about sufficient to convert the 
negative differences between the younger and the older freshmen to 
approximately zero on the average. This would mean that the change
from the freshman to the senior year in chronological age, divorced 
from the experience attending schooling, would not measurably affect 
such attitudes as are here studied. It would imply that on the average 
changes in attitude between ages eighteen and twenty-two due to mere 
growing older would be very slight or nil. 

Major Subject.—In College X every student is required to select a 
major subject of specialty in which he is required to take at least 
twenty-four semester hours of work. It seemed to be quite important 
to the task at hand to determine the degree to which attitudes were 
differently influenced by these different fields of specialty. The major 
subjects selected for study were grouped under four headings: natural 
science, English and foreign languages, economics and sociology, and 
history and geography. 

The results concerning the relation of these fields to each of the 
attitudes are given in Tables V and VI. The former table presents the 
average attitudes of seniors in the different major subjects. In order 
to get a check on the reliability of the results these status scores have 
been computed separately for two groups: one consisting of the seniors 
who were individually followed-up from their freshman year, and the 
other consisting of all seniors who were tested. 

The main point to consider in Table V is whether or not the seniors 
majoring in one subject tend to be more conservative or more liberal 
than seniors majoring in another subject. This can be studied most 
conveniently by running the eye down each column, noting the relative 
order in the attitude scores among the different major subjects. If this 
is done, several trends will be found which are consistent (in the “fol- 
low-up” and the “all senior” groups) and sufficiently pronounced to 
deserve special note.¹ First, it will be seen in the next to the last





1 The conventional Diff./PE(diff.) method for determining reliability has not
been employed here because our data do not satisfy the assumptions of that 
formula at all well. We have instead computed each average on the basis of our 
column that on all attitudes combined the history-geography group is
the most conservative, the economics-sociology group next, the 
English-language group next, and the natural science group the least 
conservative or the most liberal. The differences between groups are 
not large, but the trend is worthy of attention and especially the differ- 
ence between the history-geography and the natural science groups. 


TABLE V.—AVERAGE ATTITUDE SCORES OF SENIORS OF DIFFERENT
MAJOR SUBJECTS

                            War          Negro       Influence      Reality       Church      Total      Total
Major subject                                                                                religion,  all scales,   N
                        Average  Q    Average  Q    Average  Q    Average  Q    Average  Q   average    average

Follow-up group
  Natural science         7.47  .70    7.53  .49    6.37  2.05    6.04  2.15    5.40  1.63    5.94       6.56        31
  English-language        7.58  .55    7.38  .90    5.38  2.[?]   4.79  1.51    4.66  1.71    4.94       5.96        21
  Economics-sociology     7.46  .33    6.54  .88    5.48  2.32    5.17  1.45    4.32  1.22    4.99       5.79        10
  History-geography       6.65  .55    6.67  .15    5.18  2.1     5.10  2.85    3.76   .90    4.68       5.18        12

All Senior group
  Natural science         7.44  ...    7.32  ...    5.53  ...     5.08  ...     5.10  ...     5.24       6.09        62
  English-language        7.45  ...    7.27  ...    5.60  ...     5.04  ...     4.95  ...     5.20       6.06        41
  Economics-sociology     7.33  ...    7.00  ...    4.59  ...     4.19  ...     4.64  ...     4.46       5.54        19
  History-geography       6.72  ...    6.74  ...    4.56  ...     4.16  ...     4.27  ...     4.33       5.29        27












































Next let us break down these total attitudes into their component 
parts. We find the interesting fact that the history-geography group is 
more conservative than the natural science group in every attitude 
studied. The difference is more pronounced in attitude toward religion 
and the church than in the other attitudes. This can be seen best in 
the column designated “total religion,” which is based on the results in
the three columns preceding it. Here it will be noted that the
history-geography majors are more conservative than the natural science majors
by 1.26 points in the “follow-up” group and by .91 points in the “all
senior” group. The attitudes of English-language and
economics-sociology majors are in the middle ranges.





follow-up group and then re-computed it on the basis of our total senior group, 
which practically doubles our number of cases, of course. If, in any given com- 
parison, the two results tell the same story, and if the differences in the follow-up 
group are appreciable in relation to the variability, we are calling that difference 
to the attention of the reader, who may if he wishes at any point apply
the Diff./PE(diff.) formula, using Q as roughly equivalent to PE(dist.).
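
A reader wishing to apply the Diff./PE(diff.) formula as the footnote suggests might proceed along the lines of the sketch below. It assumes, as the footnote proposes, that Q stands in for the probable error of the distribution, and further assumes the usual formulas PE(mean) = PE(dist.)/sqrt(N) and PE(diff.) = sqrt(PE₁² + PE₂²) for independent groups. The figures are the Church averages, Q's, and N's of the natural science and history-geography majors in the follow-up group of Table V.

```python
from math import sqrt


def pe_of_mean(q, n):
    """Probable error of an average, treating Q as the PE of the distribution."""
    return q / sqrt(n)


def diff_over_pe(mean_1, q_1, n_1, mean_2, q_2, n_2):
    """Ratio of a difference between two averages to its probable error,
    assuming the two groups are independent."""
    pe_diff = sqrt(pe_of_mean(q_1, n_1) ** 2 + pe_of_mean(q_2, n_2) ** 2)
    return (mean_1 - mean_2) / pe_diff


# Attitude toward the church, follow-up group (values read from Table V):
# natural science 5.40 (Q 1.63, N 31); history-geography 3.76 (Q .90, N 12).
ratio = diff_over_pe(5.40, 1.63, 31, 3.76, 0.90, 12)
print(round(ratio, 2))
```

By the convention of the period a ratio of about four or more was commonly taken to indicate a dependable difference, though with groups as small as these the assumptions behind the formula are only roughly met, as the footnote itself cautions.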
In attitude toward war again the history-geography group is the
most conservative, while attitudes of the other three groups are about 
equal. In attitude toward the Negro there is no consistent difference 
between the history-geography majors and the economics-sociology
majors, but these two groups are slightly but consistently more con- 
servative than the natural science and the English-language majors. 

The question immediately arises as to the causal factors accounting 
for these differences, particularly the difference between the history- 
geography majors and the natural science majors. Two possible 
factors which immediately come to mind are, first, the possible differ- 
ence in the intelligence of the students majoring in the two subject- 
matter fields, and, secondly, the possible difference in attitude of the 
teachers in these fields. A careful check was made to see if any signif- 
icant differences in intelligence existed. It was found that there was 
little difference in average intelligence from one subject-matter field to 
another, the average percentile scores being 54.2 in natural science, 
52.6 in English-languages, 50.8 in economics-sociology, and 51.8 in 
history-geography. In view of the smallness of these differences in 
intelligence and of the low relationship found between intelligence and 
attitudes, it seemed definite that these small differences in intelligence 
could not contribute in any appreciable way toward the differences in 
attitudes in these subject-matter groups. But in order to make certain 
of this we matched as many members as possible of the history group 
with members of the natural science group in such a way that the aver- 
age intelligence scores were equal, and then computed the averages on 
the basis of these revised distributions. The average attitudes on these 
matched groups were not changed from those reported in the table by as 
much as .1 of a point. 

In order to determine whether or not there were differences in the 
attitudes of the professors which might have influenced the students’ 
attitudes, the experimenter had the professors in the natural science and 
history-geography fields register their attitudes on war, religion, and 
the church by marking three of the same attitude scales which had been
submitted to the students. A total of thirteen professors anonymously 
filled out the scales, that is, seven in natural science, four in history, and 
two in geography. This comprised all the professors giving under- 
graduate instruction in natural science and history, and a representa- 
tion of the instructors in geography roughly proportional to the number 
of undergraduates majoring in that department. The average position 
on the attitude scale for the natural science professors was 5.93 on war, 
5.53 on religion (only one scale used, namely, that on Influence of God
on Conduct), and 4.10 on the church. The corresponding averages for 
the history-geography teachers were 7.28, 5.20, and 4.68, respectively. 
Thus it will be seen that in attitude toward war the history-geography 
group was somewhat more liberal than the natural science group 
as shown by the difference of 1.35 between the averages. In attitudes 
toward religion and the church the two groups of professors were very 
closely similar, the slightly greater liberalism of the science group on 
religion (.33 points) being a little more than cancelled by the slightly 
greater liberalism of the history-geography group (.58 points) on atti- 
tude toward the church. The general upshot of this comparison of the 
professors’ attitudes in the natural science and history-geography fields 
is that there seems to be nothing here which could account for the 
greater liberalism of students majoring in natural science in College
X.

What, then, accounts for these differences, if intelligence and 
instructor’s attitude do not? We can only suggest a few logical
considerations,¹ or guesses. One possibility which should be considered
in connection with the difference in attitude toward religion and the 
church is that the science majors may have become, as a result of their 
studies, more conditioned to explanations in terms of natural law and 
more influenced by the theory of evolution than the history-geography 
majors.² Such differences would probably lead to greater liberalism on
the part of the science group. 

With regard to attitude toward war, it may be that the important 
place which war has played in settling boundaries and in the molding 
of policies of nations in the past influenced the history-geography 
majors to doubt the wisdom of a high degree of pacifism³ in the present
stage of development of international economics and politics, while the 





1 In using the expression “logical considerations” we do not mean to imply
that attitudes are in any large measure influenced by logic. They probably result
mainly from subtle conditioning where emotion plays a much larger rôle than
thinking. However, in studying the conditioning of students we may with profit
consider the factors which logically seem capable of conditioning their attitudes in
this way or that.

2 More than two-thirds of the history-geography group were history majors.

3 It will be remembered that the liberal end of the scale on attitude toward
war was in the direction of pacifism mainly. If it had been more in the direction
of international understanding and coöperation to remove the causes of war,
rather than in simply not fighting, it is quite possible that the attitudes of the
history group would have been quite different.
study of the mechanisms of science and the realization of what it
would mean if these were turned toward the destruction of life may have 
had some influence in convincing the science group of the impossibility 
of civilization’s standing up for any considerable length of time against 
the modern weapons of war. 

In attitude toward the Negro the history-geography group may 
have been more conscious of the social upheaval which would attend 
any radical change in the Negro’s present-day situation, while the 
science group may have had their attitudes less colored by social and 
political considerations and may, therefore, have had their attitudes 
represent a more detached and idealistic judgment. 

So much for the attitudes of students in the different major sub- 
jects as shown by the status scores of seniors. The differences in these 
status scores from major subject to major subject are interesting, but 
one cannot, of course, assume that these differences have been acquired 
exclusively during four years in college. In order to determine the 
influence of the major subjects during the four years of college, one 
must examine change scores, that is, changes from freshman to senior 
years. Let us turn next, therefore, to the study of such scores. Table 
VI gives the results based on the follow-up of all the students of two 
classes who were in college from the freshman through the senior year 
and who majored in the subjects here investigated. 


TABLE VI.—AVERAGE CHANGES IN THE ATTITUDES OF STUDENTS MAJORING IN
THE DIFFERENT SUBJECTS
(Based on the Follow-up Group)

                            War          Negro       Influence      Reality       Church      Total      Total
Major subject                                                                                religion,  all scales,   N
                        Average  Q    Average  Q    Average  Q    Average  Q    Average  Q   average    average

Natural science          +1.12  .48    +.42  .90   +2.11  2.04   +1.92  1.96   +1.37   .88   +1.80      +1.30        31
English-language          +.94  .51    +.20  .75   +1.55  1.98    +.89  1.11    +.90  1.06   +1.11       +.86        21
Economics-sociology       +.71  .58    -.58  .48   +2.25  1.27   +1.84   .85   +1.20   .85   +1.76      +1.09        10
History-geography         +.30  .75    -.39  .70   +1.47  2.55   +1.96  2.30    +.93   .35   +1.45       +.85        12












































The most significant fact shown by the table is that in all attitudes 
combined the natural science majors changed most in the direction of 
liberalism and the history-geography majors changed least. (See the 
next to the last column.) If we consider the changes in the different 
attitudes taken separately, we see that in attitude toward war the
science group changed most and the history-geography group least.
In attitude toward the Negro the science majors changed most in the 
direction of liberalism and the economics-sociology and history- 
geography majors changed most in the opposite direction. In attitude 
toward religion and the church, as shown by the ‘“‘total religion” 
column, the science majors changed most, and the English-language 
majors least, and the history-geography majors next to the least. 

In considering these results it should be borne in mind that the 
individual figures are not highly reliable and that it is only the larger 
and the more consistent differences which are worthy of much confidence.
The number of cases is small and the Q’s are large relative to
the differences, which means that the reliability of any except the 
extreme difference will be comparatively low. However, in spite of 
this rather low reliability of most of the single differences in the table, 
the general trends seem to be dependable. The most dependable 
result—and the only one which we feel safe in stressing—is the differ- 
ence between the natural science and the history-geography groups. 

Political Preference.—In considering the variables which may be 
related in a significant way to the attitudes here studied one rather 
naturally includes political preferences. But this is a very complex 
variable and it is not possible even to define it in a satisfactory way 
psychologically. It does seem safe, however, to assume that an impor- 
tant part of an individual’s political preference or affiliation is a complex 
form of behavior which we call loyalty to some one organization that he 
thinks comes nearer than any other comparable organization in squar- 
ing with his host of attitudes toward economic and governmental 
matters. The question immediately arises as to whether we are in a 
position in this study to obtain any results involving such a complex 
variable that will have any significance. If we think of the problem in 
terms of the better psychological understanding of political preferences
the answer is “no.” But if we think in terms of more specific questions
involving attitude behavior as we find it (such, for example, as giving
the political ticket for which one would vote and registering likes and
dislikes for a large number of statements on attitude scales), then there
are at least two questions which can be attacked. These are: (1) What
is the relation between the political party to which one subscribes and
his position on the conservative-radical continuum? That is, are
individuals belonging to what is popularly called a very liberal or
radical party, such as the Socialist or Communist, more liberal in
attitude toward war, race, and religion than members of, say, the
Republican party? (2) Are changes in attitude toward war, race, and
religion greater in some political parties than in others? In other
words, are students belonging to certain parties changed more by four
years in college than those belonging to other parties?

¹ All the differences here reported were recomputed on the basis of the
“all-senior-all-freshmen group,” in which the number of seniors was one hundred
forty-nine and the number of freshmen was one hundred eighty. The individual
figures are not identical with those in Table VI but the more important trends
are essentially the same. This recomputation was made, of course, as a rough
means of verifying the results based on the small follow-up group. We have
greater confidence, however, in the individual results in the follow-up group in
spite of its smallness, because here we are dealing with the same men as freshmen
and as seniors, while in the other group the selection factor, especially as it is
influenced by withdrawals of students between the freshman and senior years,
is disturbing.


Table VII. Average Attitude Scores of Seniors of Different
Political Parties

                       War           Negro         Influence     Reality       Church        Total      Total
Political parties      Aver.      Q  Aver.      Q  Aver.      Q  Aver.      Q  Aver.      Q  religion   all scales   N

Follow-up group
Republican              7.35    .50   7.13    .40   4.90     2.   4.80   1.52   3.88   1.36   4.53       5.61        35
Democratic              6.48    .95   7.07    .70   6.42   1.46   5.85   2.43   4.90   1.40   5.72       6.14         9
Independent             6.95    .31   6.34   1.46   5.79   1.86   5.54   1.97   4.04   1.11   5.12       5.73         9
Socialist-Communist     7.95    .85   8.09    .59   7.21   1.40   6.65   1.80   7.84   1.55   7.23       7.55        12

All-senior group
Republican              7.10    ...   6.85    ...   4.65    ...   4.33    ...   3.96    ...   4.31       5.38        55
Democratic              6.80    ...   7.31    ...   4.48    ...   4.35    ...   4.25    ...   4.36       5.44        24
Independent             7.34    ...   6.93    ...   4.55    ...   4.36    ...   4.32    ...   4.41       5.50        20
Socialist-Communist     7.86    ...   8.07    ...   6.52    ...   5.75    ...   6.61    ...   6.29       6.96        32

Our data on these two questions are given in Tables VII and VIII.
Table VII gives the average attitude scores for Republicans, Independents,
Democrats, and Socialists-Communists¹ in their senior year in
college. The upper half of the table is based on the “follow-up”
group and the lower half is based on all seniors tested.

¹ The Socialists and Communists are combined not on the assumption of
anything like identity in points of view, but because there were too few cases to
justify two separate groups, and it is assumed that less injustice was done by
combining them in this manner than in omitting either or in combining either
with any of the other groups.

In studying the results in this table the reader will find a certain
degree of disagreement between the figures in the two halves. This 
is due to differences in the samplings and to the unreliability of the 
measurements. There is, however, complete agreement that the 
Socialist-Communist party is on the average the most liberal (or radical) 
of the four in every attitude measured. This group differs from the 
others most in attitude toward religion and least in attitude toward war. 
There is no consistent tendency for any one of the three remaining 
parties to be more liberal or more conservative than the rest. 
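
The comparison just described can be checked with a little arithmetic. The sketch below is an illustration only: the averages are transcribed from Table VII as well as the print allows, and the names `SCALES`, `FOLLOW_UP`, `ALL_SENIORS`, and `gaps` are assumptions introduced for the example. For each scale it subtracts the mean of the other three parties from the Socialist-Communist average.

```python
# Senior averages transcribed from Table VII, in the order:
# war, Negro, total religion. All names here are illustrative assumptions.
SCALES = ("war", "Negro", "total religion")

FOLLOW_UP = {
    "Republican": (7.35, 7.13, 4.53),
    "Democratic": (6.48, 7.07, 5.72),
    "Independent": (6.95, 6.34, 5.12),
    "Socialist-Communist": (7.95, 8.09, 7.23),
}

ALL_SENIORS = {
    "Republican": (7.10, 6.85, 4.31),
    "Democratic": (6.80, 7.31, 4.36),
    "Independent": (7.34, 6.93, 4.41),
    "Socialist-Communist": (7.86, 8.07, 6.29),
}


def gaps(table, group="Socialist-Communist"):
    """Per scale: the group's average minus the mean of the other groups."""
    others = [scores for name, scores in table.items() if name != group]
    result = {}
    for i, scale in enumerate(SCALES):
        mean_others = sum(scores[i] for scores in others) / len(others)
        result[scale] = round(table[group][i] - mean_others, 2)
    return result


follow_up_gaps = gaps(FOLLOW_UP)
all_senior_gaps = gaps(ALL_SENIORS)
print("follow-up:", follow_up_gaps)
print("all seniors:", all_senior_gaps)
```

In both halves of the table every gap is positive, the gap is largest for total religion, and it is smallest for war, which is the pattern stated in the text.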


Table VIII. Average Change in Attitudes of Students of Different
Political Parties
(Based on Follow-up Group)

                       War           Negro         Influence     Reality       Church        Total      Total
Political party        Aver.      Q  Aver.      Q  Aver.      Q  Aver.      Q  Aver.      Q  religion   all scales   N

Republican             +1.03    .68   -.15    .70  +1.85   2.02  +1.36    .92   +.91    .90  +1.37      +1.00        35
Democratic              +.05    .84   +.28   1.40  +2.96   2.10  +2.52   2.11  +1.53   1.65  +2.34      +1.47         9
Independent             +.76    .80   -.15   1.39  +2.57   1.89  +2.32   2.12   +.81    .99  +1.90      +1.26         9
Socialist-Communist     +.98    .58   +.60    .66   -.03   1.85   +.94   2.00  +1.52   1.74   +.81       +.80        12

So much for the most obvious result in the table. There are two 
other points, however, which should be mentioned in passing. The first 
is that there is a great deal of overlapping in the attitudes of students of 
all parties. This leads us to the conclusion that the political preference 
of a person is a poor index of his degree of liberalism in general. The 
second point is closely related to this. There is not only overlapping 
between members of different parties in attitude toward any one prob- 
lem, say war, but there is also much unevenness in the average student’s 
position on the conservative-radical continuum from attitude to atti- 
tude. This will be particularly stressed in connection with another 
problem later, but in working up the original data for Table VII one 
could not but be impressed with the degree to which the members 
of the different political parties swing up and down the conservative- 
radical scale as they move from one attitude to another. This fact 
points to the fallacy so often committed of assuming that persons 
subscribing to an organization which is radical or very conservative in
attitude toward one set of problems will be similarly radical or similarly
conservative in attitude toward other problems. 

We come now to Table VIII which shows for each political group the 
amount of change in attitudes during four years in college. In studying
these results it is again important to remember that the change scores 
are not highly reliable, and therefore in comparing the changes in the 
various groups attention should be given only to general trends running 
through many specific results. The main result or trend which stands 
out in this table is that average change scores (as shown in the next to 
the last column) are closely similar from group to group. In one 
attitude one group changed most while in another attitude another 
changed most. This result is quite different from that based on status 
scores as shown in Table VII. There it was found that the Socialist- 
Communist group was by far the most radical, but here it is seen that 
we cannot conclude that such students have been more susceptible to 
change in attitude during their college years than others. 
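
The contrast between status and change can be set out numerically. The following is a minimal sketch, not part of the original analysis: it uses the "total, all scales" averages of the follow-up group as transcribed from Tables VII and VIII, and the names `STATUS` and `CHANGE` are assumptions made for the example.

```python
# "Total, all scales" averages for the follow-up group, transcribed from
# Table VII (senior status) and Table VIII (freshman-to-senior change).
# The dictionary names are illustrative assumptions.
STATUS = {
    "Republican": 5.61,
    "Democratic": 6.14,
    "Independent": 5.73,
    "Socialist-Communist": 7.55,
}
CHANGE = {
    "Republican": 1.00,
    "Democratic": 1.47,
    "Independent": 1.26,
    "Socialist-Communist": 0.80,
}

# Which party stands highest in status, and which changed least?
most_liberal = max(STATUS, key=STATUS.get)
least_changed = min(CHANGE, key=CHANGE.get)

# Spread between the highest and lowest party on each measure.
status_range = round(max(STATUS.values()) - min(STATUS.values()), 2)
change_range = round(max(CHANGE.values()) - min(CHANGE.values()), 2)

print("most liberal in status:", most_liberal)
print("least changed:", least_changed)
print("range of status scores:", status_range)
print("range of change scores:", change_range)
```

On these figures the party with the highest status score is also the one with the smallest average change, and the spread of the change scores is much narrower than the spread of the status scores.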

Religious Affiliation—What was said at the beginning of the last 
section about the complexity of political preference as a variable applies 
also to religious affiliation. And here we shall confine ourselves, just 
as in the last section, to a consideration of gross relationships between 
the variable and the various attitudes. In the main we shall be con- 
cerned with two questions. (1) Were the students of some religious 
groups notably more conservative or more liberal than those of other 
groups? (2) Did some groups show greater changes in attitude during 
four years of college than others? 

The results on the first question, based on the seniors of the Catholic, 
Protestant, and Jewish faiths, are given in Table IX. Two sets of 
results are given: first, those based on the seniors of the follow-up 
group, and, secondly, those based on all seniors tested. It will be seen 
at places that the agreement between the absolute scores of the two 
halves of the table is not very close. However, it is the differences 
among the averages of the three groups, and especially the relative
order of the averages from group to group, which we are particularly
interested in here. In this relative order there is almost complete
agreement between the results in the two halves.

¹ Indeed if there be any differential trend in the different political groups it is
in the opposite direction. Study of the changes in each attitude in the table will
show that the Socialist-Communist group changed less during four years of college
than one or more other groups in every attitude except that toward the Negro,
and in two out of five attitudes it changed less than any other group. It is possible
that this means that the total social group in college acts as a retarding
influence on the cultivation of attitudes too far above the mode of prevailing
attitudes, and at the same time specially stimulates changes in attitudes at levels
far below the mode.

Table IX shows that the Jewish group obtained the highest or most
liberal scores in every attitude. The Catholic group, on the other hand, 
was the most conservative in seven out of ten comparisons; and if we 
consider only war and religion this group was the more conservative in 
seven out of eight comparisons. The largest differences among the 
three groups, as might be expected, were found in attitude toward 
religion. Averages on the three scales dealing with religion and the 
church in the follow-up group (see “total religion” column) show the
Jewish group to be more liberal or radical than the Protestant by 1.37 
points and more radical than the Catholic by 1.73. The corresponding 
differences in the all-senior group were 1.02 and 1.90, respectively. 
It is interesting to note that of the three scales on religious attitudes the 
Jewish group differed from the other groups most on the scale dealing 
with the church as an organization.

In attitudes toward war and the Negro the differences between
groups were all small. It is true that in both these attitudes the Jewish
group was somewhat more liberal than the others, but the differences 
were not reliable statistically. No differences were found consistently 
between the Protestants and Catholics. 

The second main question to be dealt with in this section is that 
with regard to changes in attitude during the college period. Table X 
summarizes the results on this problem. The main fact revealed is
that the Jewish group, which in status score (in the senior year)
was more liberal than any other group, changed less during college than 
any other group in every attitude except that toward the Negro, and 
there the change was insignificant. This tendency toward negative 
relationship between status scores and change scores is similar to that 
found in the last section. This relation reminds one of the negative 
relation usually found between status score (usually initial score) 
and improvement in achievement testing and in learning problems. 
However, it is hard to see how the explanations ordinarily offered for 
such cases would apply here. The “ceiling” in each attitude scale was
quite high, there being practically no student who scored nearer than 
1.5 points of the top. The best hypothesis which can be proposed here 
is that suggested briefly in the last section, namely that the total social 
group of which one is a part, with its prevailing attitudes within a cer- 
tain range, may act as a brake on the increase in liberalism or radicalism 
on the part of individuals far above the mode, and at the same time 
specially accelerate changes in attitudes from levels far more conserva- 
tive than the mode. Moreover the instruction and reading of a student 
probably would be progressively less and less likely to stimulate him to 
greater liberalism or radicalism as he is already further and further 
along in that direction. 


Table X. Average Change in Attitudes for Students of Different

The above result is the most striking and probably the most significant
one in the table. There are two other points, however, which
will be mentioned in passing. First, it will be noted here, just as in 
the corresponding table based on political preferences, that preponder- 
antly positive changes were found in every group. We can generalize 
further and say, on the basis of this result and of other similar ones 
obtained, that no group was found which seemed characteristically
resistant to change in attitudes. If we should set out to select the
students entering college whose attitudes are most likely to be changed 
least, we would select them not on the basis of the organizations to 
which they belong but on the basis of their attitudes on specific 
problems. 

The second point to be mentioned is that the largest changes in 
the Protestant and Catholic groups were made in attitude toward 
religion and the church. The average change in these two groups in 
this attitude was 1.77 as compared with an average change of .53 in 
attitudes toward war and the Negro. The dispersion in the change 
scores in attitude toward religion was also quite large, showing that 
some students changed greatly in the direction of liberalism or radical- 
ism while others changed very little and a few changed in the direction 
of conservatism. 


III. CONSISTENCY IN LIBERALISM OR CONSERVATISM IN DIFFERENT 
ATTITUDES 


The final data to be presented in this study deal with the question 
as to the extent to which an individual is consistently liberal or con- 
sistently conservative from attitude to attitude. To state the problem 
differently: To what degree does liberalism or conservatism function in 
such a way as to make for high correlations among scores on different 
attitude scales? Is the concept of all-round liberalism or all-round
conservatism tenable? In order to study this problem of consistency
or intercorrelation among attitude scores, we have correlated each
attitude with almost every other,¹ basing the results on freshmen and
seniors separately. In addition to these r’s of the zero order we have
computed several partial r’s so as to determine the correlation between
attitudes when the influence of intelligence was partialed out. The 
correlations are given in Table XI. 
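
The partialing-out step can be illustrated with the standard first-order partial correlation formula. The sketch below is an illustration under stated assumptions: the function name `partial_r` and the three input values are hypothetical and chosen only for the example; they are not figures taken from Table XI.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))
    """
    denominator = math.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))
    if denominator == 0.0:
        raise ValueError("partial r is undefined when r_xz or r_yz is +/-1")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical inputs, assumed for illustration only:
# x and y are two attitude scales, z is intelligence.
example = partial_r(r_xy=0.20, r_xz=0.03, r_yz=0.15)
print(round(example, 3))
```

When both attitudes correlate only slightly with intelligence, as in this assumed case, the partial r stays very close to the zero-order r, so holding intelligence constant changes the picture very little.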

The table is divided into three parts. The first part deals with 
the correlations between pairs of attitudes where the members of each 
pair seem logically to be quite unrelated to each other. Here, for 
example, we have the correlations between attitude toward war and the
three different measures of attitude toward religion and the church,
and also the r’s between Negro and war, and Negro and church. In
the second part are grouped the correlations between pairs of attitudes
which seem logically to be very closely related. Here all attitudes
correlated deal with religion or the church. The third part of the table
is devoted to partial r’s.

¹ The correlations have been computed in every case where there was any
chance that a new trend would be revealed. The only r’s omitted in the freshman
group are those between attitude toward the Negro and two of the measures of
attitude toward religion and the church. We had already sampled that relation
by means of one of the scales, and there was no chance that the other r’s would
have been essentially different from this one.

It will be seen that the correlations in the first part of the table 
range from +.27 to -.05 for freshmen. The corresponding r’s based
on seniors range from +.39 to +.23. That is, the r’s of both freshmen
and seniors are in complete agreement in showing that there is low
relationship between the degree of liberalism in one field and that in
another. If we partial out the influence of general intelligence on the
first three pairs of variables, we find that the r’s remain very low, being
+.19, +.14, and +.26, respectively, for war vs. church, war vs.
influence, and war vs. reality.

¹ This r, when based upon the freshmen of 1934 and 1935 combined (N = 172),
was +.07.

In contrast to these low correlations in the first part of the table are
the relatively high ones in the second part—that is, the r’s among the 
different measures of attitude within the same general field of religion 
and the church. It will be noted that here the r’s range from .72 to 
.82 for freshmen, and from .78 to .88 for seniors. The corresponding 
partial correlations (with intelligence held constant) for freshmen range 
from .70 to .81. 

What is the meaning of these low r’s in the first part of the table, and 
the relatively high ones in the second part? The former mean that a 
given degree of liberalism in attitude toward one question or issue does 
not guarantee a similar degree of liberalism toward an essentially 
different question. The fact that an individual is a conservative in one 
field, say in attitude toward the church, reduces only slightly his 
chances of being a liberal in another field, say in attitude toward war. 
These r’s, therefore, argue against any assumption that radicalism or 
conservatism acts as a strong general factor making the individual an 
all-round radical or an all-round conservative. This is the result when 
the r’s among attitudes in different fields are considered. The r’s 
in the second part of the table are within one field—the field of religion 
and the church—and the fact that they are high means that within a 
given field of attitudes there is a great deal of consistency in one’s 
liberalism or conservatism. 

In using the word “field” above we do not mean to imply that the 
boundaries of certain logical areas in which r’s would be high are known, 
nor is there any intention of implying that even if certain different 
“fields” were known the r’s within the different ones would be very 
similar. All that we know is that the degree to which students are 
consistently conservative, liberal, or radical depends largely upon areas 
in which they are tested. Of course, if we take the positions of an 
individual on the conservative-radical scale on many different issues 
we get a single position which may be more radical or more conservative 
than the corresponding average position of another individual, but we 
must not allow this averaging process to conceal the fact that normally 
the individual himself fluctuates back and forth widely on the conserva- 
tive-radical continuum when measured on different issues. Individ- 
uals, therefore, cannot be properly classified as all-round radicals, 
liberals, or conservatives. They must be considered rather as radical,
liberal, or conservative in “spots.” The real question in the diagnosis
of any individual’s attitudes, therefore, becomes one of what, how big,
and how many are the “spots” or “fields” in which he is radical, liberal,
and conservative.

A second result in the table which deserves mention is the fact that
the r’s based on seniors are in every case higher than the corresponding
ones based on freshmen. The median difference is .11. This seems to
mean that consistency in the liberalism or conservatism of students 
increased somewhat with maturity. 

These two findings—the decidedly higher r’s in the second part of 
the table than in the first and the higher r’s for the seniors than for the 
freshmen—have very direct bearing upon the specificity-generality 
problem as it applies to the mental organization involved in attitudes. 
Some writers take the view that attitudes are specific. One representative
of this view, for example, says that “Attitudes are as numerous as
the objects to which a person responds.”¹ Others believe that attitudes
represent “broadly generalized dispositions.” The results of the present
study indicate, however, that neither thoroughgoing specificity
nor thoroughgoing generality applies in the case of the attitudes here
investigated. The degree to which the liberalism or conservatism of an
individual in one field of problems spreads over to another depends 
upon the fields involved and also upon the nature and stage of develop- 
ment of the individual. Instead of complete specificity or complete 
generality in the attitudes here measured we find only a mixture of the 
two, each supplementing the other. Both generality and specificity 
it seems must be thought of as matters of degree. Among attitudes in 
such different fields as religion, war, and race we find a low degree of 
generality and a high degree of specificity, whereas among different 
attitudes in the field of religion and the church we find a rather high 
degree of generality and a low degree of specificity. Thus we are led 
to the general conclusion that attitudes may be thought of pictorially 
as organized into something like clusters or constellations with hazy 
boundaries, and that within a given constellation the inter-r’s among 
attitudes are high, indicating a great deal of consistency in a person’s 
position on the conservative-liberal continuum in that area. But
among different constellations there is much less consistency. The 
amount of consistency within and between constellations will vary 
greatly, we suspect, with the different constellations. 


¹ Bogardus, E. S.: Fundamentals of Social Psychology (2nd Edition). New
York: Century, 1931, see p. 54.


SUMMARY AND CONCLUSIONS


1. The study extended over a six-year period and included a follow-up
of two college classes from freshman through senior year. The
total number of classes tested, including retestings, was eleven. Five
of the Thurstone attitude scales were used: Attitude toward war,
Attitude toward the Negro, Attitude toward religion (two scales),
and Attitude toward the church.

2. The average change in attitude from freshman to senior year was
in the direction of liberalism in every attitude scale except that toward
the Negro, where there was no change. The changes in attitude toward
war and toward religion and the church were reliable statistically, but 
in an absolute sense they were small. Far from the changes being 
from militarism to pacifism or from fundamentalism to atheism they 
were only changes from moderate liberalism to a little more liberalism. 

3. A tendency was found for high intelligence and liberal attitudes 
to go together on the average, but the r’s were very low in the case of 
every attitude studied. The median of the eight r’s computed between 
intelligence and war was +.03 and the median of seventeen r’s between 
intelligence and religion and the church was +.15. Correlations were 
computed separately for Protestant, Catholic and Jewish groups and 
these were somewhat lower on the average than the r’s based on the 
composite group. All our r’s stress the futility of attempting to predict
the conservatism or liberalism of students from their intelligence scores, 
at least within the range of intelligence and attitude scores here studied. 
Moreover they point to the fallacy, not infrequently committed, of 
attempting to judge the intellectual level of an individual or a group by 
the degree of conservatism, liberalism, or radicalism manifested. 

4. In the study of conservatism and liberalism of attitude among 
students of different major subjects no extremely large differences were 
found. However, it is interesting to note that at the end of the senior 
year the natural science majors were the most liberal on the average 
and the history-geography majors the most conservative. The atti- 
tudes of the English-language and the economics-sociology majors 
were in the middle ranges between these two. Also, in point of change 
in liberalism between freshman and senior years the natural science 
majors were highest and the history-geography majors lowest. Possi- 
ble explanations for these differences have been proposed in the case of 
each attitude. 

5. Of the four political parties studied the Socialist-Communist was 
the most liberal or radical on every attitude studied. There was no 
consistent tendency for any of the others (Democratic, Independent, 
and Republican) to be more conservative or liberal than the rest. 
There was a great deal of overlapping between members of all these 
political parties in attitude toward each issue studied. Moreover,
there was much unevenness in the average student’s position on the
conservative-radical scale from attitude to attitude. This leads to 
the generalization that persons subscribing to a political party or other 
organizations which are radical or conservative in attitude toward one 
set of problems, say economic, will not necessarily be similarly radical 
or conservative in attitude toward other problems. 

6. In comparing the students of different church affiliations (Protes- 
tant, Catholic, Jewish) it was found that the Jews were the most liberal 
in every attitude studied. The Catholic group was the most conservative
in seven out of ten comparisons. If we consider only war and
religion and the church, this group was the most conservative in seven 
out of eight comparisons. The largest differences among the three 
groups, as might be expected, were found in attitude toward religion 
and the church. 

7. In the case of both the religious and the political groups a nega- 
tive relation was found between status and change scores. That is, 
those groups which were most liberal at a given time, say as freshmen, 
changed in the direction of liberalism least during four years in college. 
The best hypothesis which we can propose for this is that the total 
social group of which a student is a part retards any changes toward 
liberalism or radicalism if he is far above the mean of the group, and 
may specially accelerate such changes if he is far below the mean. 
Moreover, the very liberal or radical student may receive through his 
reading and class instruction less stimulation to change than does the 
conservative student in that he may already be up to or beyond the 
point of view represented in most of his instruction. 

8. In the study of the intercorrelations among the different attitudes 
it was found that the r’s among attitudes in different fields like war, 
race, and religion, were positive but small, averaging +.15 for freshmen 
and +.31 for seniors, whereas the r’s among the three different attitude 
scales based on the field of religion and the church were much higher, 
averaging +.75 for the freshmen and +.82 for the seniors. These 
intercorrelations, especially those among different fields, argue against 
any assumption that radicalism or conservatism acts as a strong general 
factor making the individual an all-round radical or an all-round con- 
servative. The degree to which students will be consistently conserva- 
tive, liberal, or radical will depend largely upon the fields in which 
they are tested. The term conservative or liberal is not a term to be 
applied to the individual in any all-round sense. Instead of being
all-round liberal or all-round conservative it seems that individuals are
liberal and conservative in “spots.” Of course, on the average, one
individual stands higher on the conservative-liberal continuum than 
another, but the process of averaging conceals the marked degree to 
which an individual usually fluctuates back and forth on this continuum 
as he is measured from attitude to attitude. 

These results have very definite bearing upon the specificity- 
generality controversy concerning the mental organization involved in 
attitudes. It seems in light of these findings that both specificity and 
generality must be thought of as matters of degree. Among “fields”
as different as religion, race, and war there is a high degree of specificity
in conservatism or liberalism, whereas, among different measures of
attitude in the “field” of religion and the church there is a relatively
high degree of generality and, consequently, a low degree of specificity. 
It seems, therefore, that attitudes may best be thought of as organized 
into something like clusters or constellations with hazy boundaries, 
and that within a given cluster the intercorrelations among measures of 
conservatism-liberalism are high, indicating much consistency, but 
that among different clusters there may be much or very little 
consistency. 

As a practical matter of education in attitudes one might inquire as 
to which of these two concepts, specificity or generality, deserves the 
greater emphasis. It appears to be definite that the degree to which 
attitudes are specific is the more unexpected fact and the one, perhaps, 
for this reason, which deserves the greater stress. It is hard for the 
layman to believe that liberalism or conservatism of a student in one 
field, and particularly the change in attitude in one field, transfers or 
spreads to other fields to such a slight degree as it does. In order to
increase the degree of generality in the attitudes and conduct of students,
teachers must direct their education toward generalization, rather than
assume that whenever improvement is made in one area it will spread 
widely by some automatic generalizing process. 








A COMPARISON OF THE VOCABULARIES OF 
ANGLO-AMERICAN AND SPANISH-AMERICAN 
HIGH-SCHOOL PUPILS 


LOAZ W. JOHNSON
Berkeley, California 


INTRODUCTION 


During the past few years the trend of Spanish-American popula- 
tion in the United States has been definitely upward. According to 
the 1920 census there were 700,541, and according to the 1930 census
there were 1,422,533, Spanish-speaking people in the United States, 
chiefly in the Southwest. The 1929-1930 school census of Grant 
County, New Mexico—the particular region of which this study 
treats—showed over sixty-one per cent Spanish-speaking. The sum- 
mary of tests given in the rural schools of Grant County, September, 
1935, showed the potential Spanish-American school population still to 
be in excess of the Anglo-American. 

This rapid increase in Spanish-American population in the South- 
west has greatly augmented the school problems of this region. Some 
investigations have been made to determine the nature and extent of 
these problems. These studies have focused around the hereditary, 
the environmental, and the linguistic aspects of the problem, and have 
been somewhat conflicting in results. However, there is general 
accord that, when they are measured by the devices and standards used 
in the schools of the United States, the Spanish-American pupils are 
greatly handicapped. 

This is but natural because the Spanish-American is of a different
race. His motives, his tendencies, his philosophy of life, and his
customs are very different from those of the Anglo-American. And 
since he uses a different language early in life, his idioms of thought 
must necessarily be different. His desire to be among his own people, 
his care-free attitude, and his desire for unusual, dramatic, and even 
reckless action, sometimes at the expense of life, make the Spanish- 
American’s problems different from those of Anglo-Americans. Inves- 
tigations thus far have not gathered anything like sufficient information 
for a satisfactory and fair solution of this educational problem. 


THE PROBLEM 


Apparently there is a need to know more of the possible language 
handicaps of the Spanish-American pupils. After checking the available
findings, it was concluded that future studies should be of a more
specific nature. A vocabulary study for comparing Spanish-American 
with Anglo-American pupils, as far as could be determined, had not 
been made. This evident lack of information and the felt need for it 
led to this study. 

While a vocabulary study treats directly of the linguistic phase of 
the problem, it is almost as closely related to the environmental, and is 
indirectly tied up with the hereditary aspects of the problem. The 
Spanish-Americans are more or less isolated in quarters of their own. 
They do not use public libraries and other facilities for gathering a 
broader store of knowledge as freely as Anglo-Americans. The 
Spanish-Americans do not make the contacts which would tend to 
increase their vocabularies along lines of common use by Anglo- 
Americans. When at home or with others of his kind the Spanish- 
American practically always converses in his native tongue. These 
customs and practices naturally limit him in his ability to perform on 
tests, either for intelligence or achievement, which have the information 
expressed in symbols foreign to him. There is need for information 
which shows just how much he is limited in the formulating of new 
thoughts because of his lack of understanding of specific words. 


THE OBJECTIVES 


In accordance with this felt need, a study was made of the 
vocabularies of the high-school pupils of Grant County, New Mexico, 
the school year of 1935-1936, with the following objectives: 

1. To compare the vocabularies, as revealed by certain tests, of 
groups of Spanish-American high-school pupils with the vocabularies 
of groups of Anglo-American high-school pupils. 

2. As far as the test results reveal, to determine and compare the 
increase in vocabularies made by the relative groups. 

3. To offer interpretations of the findings as they pertain to the 
bilingual problems of the Southwest. 


PROCEDURES USED IN THE STUDY 


Since this study was to be confined largely to the high school, it 
was thought best to get a check on the entering freshmen. This 
gave an opportunity to compare the two groups before the high school 
had any influence upon them. A different form of the test was given 
to these freshmen and to all other high-school classes at the end
of the year. In this manner it was possible to get a comparison of
the groups at the various high-school levels. 

A search was made for vocabulary tests by subjects, but none could 
be found. None of the major publishers of standardized tests in the 
United States could furnish such tests or give information leading to 
them. Consequently, it was decided to use The Inglis Tests of English 
Vocabulary as a general test and to construct subject tests for use with 
the freshmen in New Mexico State Teachers College High School for 
locating any special phases of the problem. 

Four vocabulary tests were constructed, one in each of the following 
solids: General Science, Grammar-Composition, Social Science, and 
Mathematics. In building the General Science test approximately 
two hundred words were listed from the indexes of the basic and reference
texts being used in the freshman year. Each of three teachers in
this field checked independently the words on the original list which he 
considered necessary to a reasonable understanding of General Science 
on that particular level. The words which all three checked, one 
hundred twenty-one in number, were used in the vocabulary test for 
this subject. The original test was administered to pupils who were 
not involved in the experiment and then revised before being used in 
the testing program. The tests in the other subjects were constructed 
in a similar manner. No attempt was made to arrive at norms.


THE RESULTS 


To get a better understanding of the situation, the norms for 
the Inglis Tests should be available. They are presented in connection 
with Table I. Unfortunately, the description of these tests did not 
state whether these norms were for the beginning or ending of the year. 
Since tests were given both at the beginning and ending of the year, 
comparisons will be made for both times, but the latter will be con- 
sidered basic. Each test has one hundred fifty words and the score is 
represented by the number right. 

Table I is designed to give a general picture of the bilingual prob- 
lems in the high schools of Grant County as revealed by The Inglis 
Tests of English Vocabulary. In this table no attempt was made to 
show the relative progress of any two groups over a given period of 
time, but the purpose was to give the relative standings of the two 
groups on each class level at a given time. The results of Form A 
given to the freshmen at the beginning of the year were included to 
enlarge the picture and to show the standings of entering freshmen.
The critical ratios of the differences between means range from
4.00 to 7.85. This means that it is practically certain that any 
group of Anglo-American pupils chosen at random would excel any 
group of Spanish-American pupils chosen in the same manner. Conse- 
quently, the question arises as to whether the Anglo-American pupils 
are greatly excelling what they should do or whether the Spanish- 
Americans are falling short of what they should do. 
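
The critical ratios reported in Table I can be checked from the summary figures alone. The sketch below is a minimal illustration, assuming the conventional formulas of the period: the standard error of a mean taken as SD divided by the square root of N, the standard error of a difference as the square root of the sum of the squared standard errors, and the critical ratio as the difference in means divided by its standard error. The article does not state which formulas were used, and the function names are hypothetical. The figures are those given in Table I for the freshman test of May, 1936.

```python
import math


def se_of_mean(sd, n):
    """Standard error of a mean, assumed here to be SD / sqrt(N)."""
    return sd / math.sqrt(n)


def se_of_difference(se_a, se_b):
    """Standard error of the difference between two independent means."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def critical_ratio(mean_a, mean_b, se_a, se_b):
    """Difference in means divided by the standard error of the difference."""
    return (mean_a - mean_b) / se_of_difference(se_a, se_b)


# Freshmen, May, 1936 (Table I): Anglo-American N = 112, Spanish-American N = 38.
se_anglo = se_of_mean(18.30, 112)    # Table I prints 1.73
se_spanish = se_of_mean(13.50, 38)   # Table I prints 2.18

# Critical ratio computed from the standard errors as printed in Table I.
cr = critical_ratio(65.35, 50.80, 1.73, 2.18)   # Table I prints 5.23
print(round(se_anglo, 2), round(se_spanish, 2), round(cr, 2))
```

Recomputed standard errors may differ from the printed ones in the second decimal place, since the rounding and the exact divisor used in the article are not known.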


TABLE I.—THE MEAN SCORES AND THEIR STATISTICAL EVALUATION FOR The
Inglis Tests of English Vocabulary, FORM A GIVEN TO HIGH-SCHOOL FRESHMEN
IN GRANT COUNTY, NEW MEXICO, IN SEPTEMBER, 1935, FORM B GIVEN
TO FRESHMEN, SOPHOMORES, JUNIORS, AND SENIORS IN MAY, 1936¹

Classes                     | Groups           | Number of | Mean chronological | Mean score, | Mean score, | Difference | SD    | SE of | SE of      | Critical
                            |                  | pupils    | age (years-months) | Form A      | Form B      | in means   |       | means | difference | ratios
1                           | 2                | 3         | 4                  | 5           | 6           | 7          | 8     | 9     | 10         | 11
----------------------------+------------------+-----------+--------------------+-------------+-------------+------------+-------+-------+------------+---------
Freshmen, Sept., 1935, test | Anglo-American   | 127       | [illegible]        | 57.90       | .....       | .....      | 17.45 | 1.55  |            |
                            | Spanish-American | 44        | [illegible]        | 39.45       | .....       | 18.45      | 11.60 | 1.76  | 2.35       | 7.85
Freshmen, May, 1936, test   | Anglo-American   | 112       | 15-1               | .....       | 65.35       | .....      | 18.30 | 1.73  |            |
                            | Spanish-American | 38        | 15-9               | .....       | 50.80       | 14.55      | 13.50 | 2.18  | 2.78       | 5.23
Sophomores, May, 1936, test | Anglo-American   | 92        | 16-1               | .....       | 66.45       | .....      | 21.80 | 2.27  |            |
                            | Spanish-American | 38        | [illegible]        | .....       | 46.60       | 19.85      | 12.00 | 1.93  | 2.98       | 6.66
Juniors, May, 1936, test    | Anglo-American   | 80        | 17-4               | .....       | 70.35       | .....      | 18.50 | 2.08  |            |
                            | Spanish-American | 23        | [illegible]        | .....       | 52.50       | 17.85      | 14.35 | 2.99  | 3.64       | 4.88
Seniors, May, 1936, test    | Anglo-American   | 80        | [illegible]        | .....       | 76.80       | .....      | 19.35 | 2.17  |            |
                            | Spanish-American | 22        | [illegible]        | .....       | 60.25       | 16.55      | 16.53 | 3.54  | 4.14       | 4.00

¹ Norms for The Inglis Tests of English Vocabulary.

Ninth grade ......... Forty-five words, or thirty per cent
Tenth grade ......... Sixty-three words, or forty-two per cent
Eleventh grade ...... Seventy-eight words, or fifty-two per cent
Twelfth grade ....... Eighty-seven words, or fifty-eight per cent
College freshmen .... One hundred five words, or seventy per cent


The answer to the above question may be found by comparing the 
data in Table I with the norms accompanying this table. It will be 
noticed that the mean for beginning Anglo-American freshmen was 
approximately thirteen points above the ninth-grade norm, and only 
about five points below the tenth-grade norm. However, the mean for 
the beginning Spanish-American freshmen was a fraction under six 
points below the norm. At the end of the year the mean for
Anglo-American freshmen was two points above the tenth-grade norm, and
the mean of the Spanish-American freshmen was approximately six
points above the ninth-grade norm. However, this good showing does
not continue throughout the grades. The sophomores, relatively speaking,
dropped some; the juniors, just a little more; and the seniors, a
great deal. In fact, the graduating Anglo-American seniors dropped a
point below eleventh-grade norm, and the graduating Spanish-American
seniors dropped about three points below tenth-grade norm. The
mean for graduating Spanish-American seniors was only about three
points above the mean for beginning Anglo-American freshmen.
These figures not only show a great handicap of Spanish-American
pupils, but also support a common statement of today; namely, that
very little direct vocabulary work is being done in the high schools.

GRAPH I.—The Inglis Tests of English Vocabulary, given September, 1935.
[Legend: entering Spanish-American freshmen.]

Perhaps attention should be called to the absence of mental ages 
and the range of chronological ages. Mental ages of all the pupils 
were not available, consequently this factor had to be omitted. The 
means of the chronological ages show the Spanish-American groups to 
be from seven to twelve months older than the corresponding
Anglo-American groups.

It will be noticed in Graph I that entering Anglo-American freshmen
range from twenty-five points below to sixty points above the
freshman norm of forty-five. Of course those of the lower range could 
not be expected to do anything like average freshman work, while 
those in the upper range approached the college freshman norm. Nor 
could very many from the Spanish-American group be expected to do 
average freshman work since most of them fell below the freshman norm. 


TABLE II.—MEAN SCORES AND THEIR STATISTICAL EVALUATION FOR Teacher-made
Subject Vocabulary Tests GIVEN TO THE FRESHMEN IN NEW MEXICO STATE
TEACHERS COLLEGE HIGH SCHOOL

Subjects            | Groups            | Number of | Number of words | Mean scores | Differences | SD          | SE of means | SE of       | Critical
                    |                   | pupils    | in test         |             | in means    |             |             | differences | ratios
1                   | 2                 | 3         | 4               | 5           | 6           | 7           | 8           | 9           | 10
--------------------+-------------------+-----------+-----------------+-------------+-------------+-------------+-------------+-------------+---------
General science     | Anglo-Americans   | 66        | ...             | 78.18       | ....        | 19.80       | 2.40        |             |
                    | Spanish-Americans | 27        | 121             | 68.80       | 9.30        | 13.30       | 2.37        | 3.37        | 2.76
Grammar-composition | Anglo-Americans   | 66        | ...             | [illegible] | ....        | [illegible] | [illegible] |             |
                    | Spanish-Americans | 27        | 114             | 66.00       | 9.30        | 9.90        | 1.90        | 2.58        | 3.61
Social science      | Anglo-Americans   | 66        | ...             | 78.70       | ....        | 14.40       | 1.78        |             |
                    | Spanish-Americans | 27        | 112             | 70.45       | 8.25        | 12.55       | 2.36        | 2.95        | 2.79
Mathematics         | Anglo-Americans   | 66        | ...             | 84.35       | ....        | 15.10       | 1.86        |             |
                    | Spanish-Americans | 27        | 114             | [illegible] | 8.35        | 12.55       | 2.36        | 3.00        | 2.78

Table II gives the results of the subject vocabulary tests. Since 
there are no norms with which to compare the different groups, all 
comparisons will have to be made between the groups themselves. 
Practically the same story is told by these results as by the results of 
the Inglis Tests, except that it is slightly toned down in this case. The 
Anglo-American group excelled in every instance, but the critical 
ratios of the difference of means are not so significant as in the general 
test results. The close range of these critical ratios seems to indicate 
that the Spanish-American group is about equally handicapped in each 
of the four solids. 

The slight narrowing of the margin between these groups as shown 
by the subject tests may have been caused by the limited number of 
cases, or by the conditions under which the tests were administered. 
If it was the latter, it may have some significance. These tests were 
administered at the beginning of the second semester. Though the
teachers of the various subjects made no special efforts at teaching
vocabularies, the fact that most of the Spanish-American pupils were 
taking these subjects necessitated their coming in contact with many 
of the test words; while their life habits, as explained in the beginning 
paragraphs of this paper, would naturally limit their contacts with the 
words which would be found in a general English vocabulary test. If 
these suppositions could be substantiated, they should offer valuable 
suggestions to curriculum-builders in the Southwest.

GRAPH II.—Vocabulary test in general science.
[Distribution curves for Spanish-Americans and Anglo-Americans.]

Graphs II-IV are presented to show the distribution and overlap of 
the groups on the different subject tests. It will be noticed that very 
few Spanish-Americans are found above the mean of the Anglo- 
American group on any of the tests. Apparently they made their 
best showing on the Mathematics Test. A count of the rights and 
wrongs revealed that the Spanish-American boys excelled the girls by a 
few points on the General Science Test and the girls excelled the boys 
on the Mathematics Test. Neither these nor any other differences 
located by the count justify graphic representation. 

It had not been the purpose in this study to compare the progress 
of two groups over a period of time, but since one of the major objectives 
was to find any information pertinent to the bilingual problem, it was
concluded that such a comparison might be worth while. Table III
is introduced for the purpose of comparing, in a fashion, the relative 
progress of groups. This table shows that the Spanish-American gain 
in means from September, 1935, to May, 1936, was 3.90 greater than 
the gain of the Anglo-Americans. The critical ratio of the difference 
between means was 1.07, which does not indicate a reliable difference. 
However, since the chances are about two to one in favor of the Spanish- 
Americans, the difference may have some significance. 
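
The comparison of gains can be retraced from the figures in Table III. The sketch below is a minimal illustration; the variable names are hypothetical, and the last step, which turns the critical ratio into odds from the normal curve, is only one possible reading of the "two to one" statement and not a procedure the article describes.

```python
import math

# Mean scores from Table III (Form A in September, 1935; Form B in May, 1936).
anglo_gain = 65.35 - 57.90
spanish_gain = 50.80 - 39.45

# Excess of the Spanish-American gain over the Anglo-American gain.
excess_gain = spanish_gain - anglo_gain

# Standard error of the gains as printed in Table III.
se_of_gains = 3.64
critical_ratio = excess_gain / se_of_gains

# Assumption: two-sided normal-curve probability that a purely chance
# difference would fall short of the observed critical ratio, expressed
# as odds against the difference being due to chance.
p_within = math.erf(critical_ratio / math.sqrt(2))
odds = p_within / (1 - p_within)

print(round(excess_gain, 2), round(critical_ratio, 2), round(odds, 1))
```

Under this reading the odds come out somewhat above two to one, which is roughly in line with the statement in the text; other conventions of the period would give different odds.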


CONCLUSIONS 


On the basis of the evidence gathered and presented only a few 
definite conclusions can be made. These with the qualified conclu- 
sions, all of which pertain to the schools of Grant County, New Mexico, 
are presented as follows: 

1. Spanish-American high-school pupils labor with a definite 
vocabulary handicap as compared with Anglo-Americans in the same 
schools and as compared with the norms for The Inglis Tests of English 
Vocabulary. 

2. Spanish-American high-school pupils are retarded from seven to 
twelve months as compared with Anglo-American pupils. 


TABLE III.—COMPARISON OF THE PROGRESS MADE BY ANGLO-AMERICAN AND
SPANISH-AMERICAN FRESHMEN IN THE HIGH SCHOOLS OF GRANT COUNTY,
NEW MEXICO, AS SHOWN BY The Inglis Tests of English Vocabulary

Groups           | Number of   | Number of   | Mean score, | Mean score, | Gain in | Gain in mean scores, | SD,         | SD,         | SE of | Critical
                 | pupils,     | pupils,     | Form A,     | Form B,     | mean    | Spanish- over        | Sept., 1935 | May, 1936   | gains | ratio
                 | Sept., 1935 | May, 1936   | Sept. test  | May test    | scores  | Anglo-Americans      | test        | test        |       |
1                | 2           | 3           | 4           | 5           | 6       | 7                    | 8           | 9           | 10    | 11
-----------------+-------------+-------------+-------------+-------------+---------+----------------------+-------------+-------------+-------+---------
Anglo-American   | 127         | 112         | 57.90       | 65.35       | 7.45    | .....                | 17.45       | 18.30       |       |
Spanish-American | 44          | 38          | 39.45       | 50.80       | 11.35   | 3.90                 | 11.60       | 13.50       | 3.64  | 1.07


3. Though the school census shows a potential school population
of Spanish-Americans as great as or greater than that of Anglo-Americans,
not more than a third as many Spanish-Americans as Anglo-Americans
are in high school.

4. Spanish-American sophomores, juniors, and seniors, and Anglo-American
juniors and seniors were appreciably below the norms of
The Inglis Tests of English Vocabulary.


RECOMMENDATIONS 


On the basis of the interest manifest by the pupils when they com- 
pared their scores with the norms, a system of rather frequent testing 
whereby the pupils can check and compare their own progress is sug- 
gested as a means of stimulation. Projects in which each pupil builds 
in each subject a glossary of the words which he and the teacher feel 
are necessary to an understanding of the subject offer great possibilities. 
Such projects could be adjusted to the needs of the Spanish-American 
pupil. In schools in which there is a high per cent of Spanish-Ameri- 
cans, possibly that portion of the curriculum which could be changed 
without lowering standards too much, should be modified to include 
more racial customs and ideals of the Spanish-American people. 
Studies should be made to determine the relative progress of the racial 
groups, and experiments should be conducted to find the most useful 
methods and devices for effective vocabulary building.



THE SIMILARITY FACTOR IN TRANSFER AND
INHIBITION 


BRANTLEY WATSON 


Duke University 


The similarity factor is recognized as one of the most important 
determining conditions of transfer and inhibition. Experimental 
evidence of the relationships involved has been derived largely from 
investigation of the nature of retroactive inhibition. The general con- 
clusions regarding the problem, drawn from an exhaustive survey of the 
available literature, are summarized by Britt: 


At the maximum of dissimilarity (in content, meaning, form, method of 
operation, environment, etc.) between the two activities, retroactive inhibition 
may occur. As the degree of similarity of one or all of these factors is relatively 
increased, the degree of retroactive inhibition also tends to increase. A certain 
point is eventually reached, however, after which increasing the degree of 
similarity results in more and more actual identity of the various factors; and 
from this point on, the amount of retroaction tends to decrease until at the
upper limit, actual identity of all factors, there may be no inhibition at all but
simply repetition. 


Although the general relations described above are supported by
experimental data, most investigations have limited the degrees of
similarity to subjective estimations of the variable, as is characterized
by the early work of Robinson, Skaggs, and McGeoch. The first
comprehensive attempt at quantitative control of the similarity factor
was made by Robinson in order to determine the validity of the law,
originally stated by Skaggs:


As similarity between interpolation and original memorization is reduced
from near identity, retention falls away to a minimum and then rises again,
but with decreasing similarity it never reaches the level obtaining with
maximum similarity.


His material consisted of a list of eight consonants exposed visually 
for five-tenths of a second, the last four of which he considered as the 
interpolated material and in which he inserted varying numbers of 
consonants identical to the first four. Recall was tested for the first 
four items of the series. 

Supplementing Robinson’s work, Harden further varied the interpolated
material by inserting numbers into the last four items of the
series instead of other consonants. She concluded that Robinson’s
results, supplemented by her own, substantiated Skaggs’ and Robinson’s
original predictions.

No single experiment has been reported, however, in which the scale
from maximum to minimum similarity has been so defined as to warrant
the determining of quantitative relationships throughout the scale.
Furthermore, in no study has the attempt been made to evaluate the 
variables from the standpoint of both transfer and inhibition. It is the 
purpose of this article to present the results from such an investigation 
and to suggest certain theoretical implications derived from an analysis 
of the data. 

The learning activity of the experiment involved sorting decks of 
cards into various patterns. The material consisted of an upright box 
containing sixteen compartments, beneath which were numbers cor- 
responding to numbers on the cards to be sorted. Only two-place 
numbers were used (twelve to twenty-seven inclusive). The subjects, 
taken individually, were seated before a table on which the box was 
placed. Instructions given to each of the subjects were as follows: 


You will be given a deck of cards to sort into these compartments. The 
numbers on the cards correspond to the numbers beneath the compartments. 
This is an experiment to determine how quickly you can learn where to put 
each card. When I say “go,” turn the deck over so that you can see the 
numbers and begin at once. [A deck of cards, upside down, was given the 
subject.] As soon as you have sorted this deck I shall give you another one 
just like it. Always wait until I say “go” before you start to sort a new
deck. Do you understand what you are to do? Ready, go.


There were eighty cards, five for each of the sixteen compartments, 
in each deck. None of the first sixteen cards sorted were duplicates, so 
that a record of the time required to sort one card into each compart- 
ment could be taken in addition to the time required for sorting the com- 
plete deck. Sorting time was checked with a stop-watch. As soon as 
the first deck was sorted, the subject was given another deck and 
another box similar to the first one. Each subject sorted ten decks, or 
fifty cards into each compartment, for original learning. A record was 
taken of the time required to sort each deck. Also, in addition to a 
record for the first sixteen cards of the first deck, a record was taken of 
the time required to sort the last sixteen cards of the last deck, none 
of the last sixteen cards being duplicates. At the end of the tenth 
sorting, the subject was told that he had completed this part of the
experiment and that he might rest for a few minutes in an adjoining
room. Each of the subjects spent the time reading a humorous maga- 
zine which had been especially provided. During the interval (three 
minutes) the experimenter rearranged the pattern of numbers on the 
boxes. The subject was then asked to sort the new pattern as quickly 
as possible. Time records were taken as before for sorting ten decks
of cards into the interpolated pattern. At the end of the tenth sorting 
there was another interval of three minutes, after which the subject 
was timed in sorting one deck of cards into the original pattern. Each 
subject was instructed that the final pattern was the same as the original 
one. 
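
The make-up of a deck can be sketched in code. The example below is a minimal illustration, assuming that every deck was arranged like the first and last ones described above, with the first sixteen and the last sixteen cards each containing every number once; the article states this only for the first sixteen cards of the first deck and the last sixteen cards of the last deck. The function name and the seeded random generator are hypothetical.

```python
import random

# The sixteen two-place numbers used on the compartments and cards.
NUMBERS = list(range(12, 28))


def build_deck(rng):
    """Return an 80-card deck: five cards for each of the sixteen numbers."""
    opening = NUMBERS[:]   # one card per compartment, no duplicates
    closing = NUMBERS[:]   # likewise for the final sixteen cards
    middle = NUMBERS * 3   # the remaining three cards per number
    rng.shuffle(opening)
    rng.shuffle(closing)
    rng.shuffle(middle)
    return opening + middle + closing


deck = build_deck(random.Random(1938))
# prints: 80 True True
print(len(deck), sorted(deck[:16]) == NUMBERS, sorted(deck[-16:]) == NUMBERS)
```

Keeping the opening and closing runs free of duplicates is what allows a sixteen-card time to be read as the time for one card per compartment.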

Different patterns were interpolated after the original learning for 
different groups of subjects. These were varied as follows: For Group 
I the interpolated pattern was the same as the original one. For Group 
II four of the numbers were in different positions (the same numbers 
were used in the interpolated as in the original pattern) ; for Group III, 
eight; for Group IV, twelve; and for Group V all sixteen numbers were 
in a different position. For a sixth group the interpolated pattern was 
similar to that for Group V except that four of the numbers were 
replaced by letters. In the deck to be sorted for this pattern a similar 
change was made. For Group VII eight of the numbers were replaced 
by letters; for Group VIII, twelve; and for Group IX letters were sub- 
stituted for all of the numbers. For a tenth group of subjects the 
interpolated activity consisted of taking an intelligence test for thirty 
minutes, this representing approximately the time required for the 
interpolated learning of the other groups. The choice of an intelligence 
test was not made for purposes of measurement, but merely to occupy 
the subject with an activity which would prevent his thinking about the 
original learning. 
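
The graded series of interpolated patterns for Groups I to IX can be sketched as follows. This is a minimal illustration under stated assumptions: the article does not say which positions were changed or which letters were used, so the positions and letters here are drawn at random, and each group receives a fresh rearrangement rather than one built on the Group V pattern. The names `GROUPS` and `interpolated_pattern` are hypothetical.

```python
import random
import string

# (numbers in changed positions, numbers replaced by letters), Groups I to IX.
GROUPS = {
    "I": (0, 0), "II": (4, 0), "III": (8, 0), "IV": (12, 0), "V": (16, 0),
    "VI": (16, 4), "VII": (16, 8), "VIII": (16, 12), "IX": (16, 16),
}


def interpolated_pattern(original, n_moved, n_letters, rng):
    """Move n_moved entries to new positions, then replace n_letters by letters."""
    if n_moved == 1:
        raise ValueError("a single entry cannot change position by itself")
    pattern = list(original)
    moved = rng.sample(range(len(pattern)), n_moved)
    values = [pattern[p] for p in moved]
    # Rotating the chosen values by one place guarantees that each of them
    # lands in a position different from its original one.
    for i, p in enumerate(moved):
        pattern[p] = values[(i + 1) % n_moved]
    replaced = rng.sample(range(len(pattern)), n_letters)
    letters = rng.sample(list(string.ascii_uppercase), n_letters)
    for p, letter in zip(replaced, letters):
        pattern[p] = letter
    return pattern


original = list(range(12, 28))
rng = random.Random(0)
patterns = {
    group: interpolated_pattern(original, moved, letters, rng)
    for group, (moved, letters) in GROUPS.items()
}
for group, pattern in patterns.items():
    print(group, pattern)
```

Group X is left out because its interpolated activity was an intelligence test rather than a sorting pattern.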

An attempt was made to get at a subjective evaluation of the conditions
of the experiment by having two subjects from each group “think
out loud” while sorting the cards, by intensively questioning the subjects
as to the nature of their reactions, and by taking notes during the
process of sorting on the nature of the observable motor reactions. 

The subjects were sixty men and women students of Duke Univer- 
sity, the results from ten of whom were discarded because of inability 
to equate the subjects in any of the experimental groups. The groups, 
consisting of five subjects each, were made as nearly comparable as 
possible on the basis of the original sorting time. Each group was 
composed of one subject whose time for sorting the first deck was within
the range 5:00-5:30 (five minutes to five minutes and thirty seconds),
one whose time was from 4:30-5:00, two whose time was from 4:00-4:30,
and one whose time fell in the range 3:45-4:00. After the nature
of the original learning curve had been established, subjects were 
further equated as nearly as possible for the various groups on the basis 
of their individual learning curves. 

For each group was calculated (1) per cent of transfer from original 
to interpolated materials and (2) per cent of retroactive inhibition for 
each of the interpolated patterns. Data were obtained on the basis 
both of time required to sort complete decks and of time required to 
sort sixteen cards of a given deck. Since the data for both types of 
measure show the same general trends, and since the slight differences 
which do exist are in all probability caused by the small amount of 
relearning involved in the former, only the latter will be considered here. 

If $A^{1}$ represents the time required to sort the first sixteen cards of
the first deck, $A^{10}$ the last sixteen cards of the tenth deck, $B^{1}$ the first
sixteen cards of the first deck of the interpolated material, $B^{10}$ the last
sixteen cards of the tenth deck, and $A^{11}$ the first sixteen cards of the
original material following the interpolation, then the temporal
sequence of the experiment can be represented as follows:

$$A^{1} \ldots A^{10} \quad \text{(three minutes)} \quad B^{1} \ldots B^{10} \quad \text{(three minutes)} \quad A^{11}.$$

In computing the various measures, $A^{1} \ldots A^{10}$ is considered as a
scale on which $B^{1}$ is placed for determining the amount of transfer.
Thus, $A^{1} \ldots B^{1} \ldots A^{10}$. The distance $A^{1} \ldots B^{1}$ represents the
amount of transfer (how much faster $B^{1}$ was sorted than $A^{1}$), and the
per cent of transfer is calculated as a ratio of this distance to the total
range. The formula, then, is:

$$\text{per cent transfer} = \frac{A^{1} - B^{1}}{A^{1} - A^{10}}$$

Similarly, the per cent of retroactive inhibition was calculated on
the basis of the original learning range, and the formula becomes:

$$\text{per cent retroactive inhibition} = \frac{A^{11} - A^{10}}{A^{1} - A^{10}}$$


Here one hundred per cent retroactive inhibition would represent a
condition in which the sorting of the original pattern ($A^{11}$) following the
interpolated pattern would require as much time as the first sorting of
the original pattern ($A^{1}$). Zero per cent retroactive inhibition would represent a
condition in which $A^{11}$ was the same as $A^{10}$. It must be borne in mind
that the calculation of retroactive inhibition as made above does not 
represent absolute amounts. Such would be the case only if the last 
sorting of the original material represented complete learning. For the 
purposes of this study, however, measures of relative effects, not scaled 
to an absolute zero, are entirely satisfactory. 
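
The two measures can be worked through for Groups I and II with the sorting times reported in Table I. The sketch below is a minimal illustration under two assumptions not stated in the article: the tabled times are read as minutes and seconds (1.07 as one minute seven seconds), a reading adopted because it reproduces the printed percentages for these groups, and each ratio is multiplied by one hundred to express it as a per cent. The function names are hypothetical.

```python
def to_seconds(tabled_time):
    """Read a tabled time such as 1.07 as 1 minute 7 seconds (an assumption)."""
    minutes = int(tabled_time)
    seconds = round((tabled_time - minutes) * 100)
    return 60 * minutes + seconds


def per_cent_transfer(a1, a10, b1):
    """Saving on the first interpolated sorting, relative to the learning range."""
    return 100 * (a1 - b1) / (a1 - a10)


def per_cent_retroactive_inhibition(a1, a10, a11):
    """Loss on the final sorting of the original pattern, on the same range."""
    return 100 * (a11 - a10) / (a1 - a10)


# Group I times as printed in Table I: A1, A10, B1, A11.
a1, a10, b1, a11 = (to_seconds(t) for t in (1.07, 0.34, 0.35, 0.31))
transfer_1 = round(per_cent_transfer(a1, a10, b1), 1)                    # table: 97.0
inhibition_1 = round(per_cent_retroactive_inhibition(a1, a10, a11), 1)   # table: -9.1

# Group II times as printed in Table I.
a1, a10, b1, a11 = (to_seconds(t) for t in (1.07, 0.28, 0.45, 0.34))
transfer_2 = round(per_cent_transfer(a1, a10, b1), 1)                    # table: 56.4
inhibition_2 = round(per_cent_retroactive_inhibition(a1, a10, a11), 1)   # table: 15.4

print(transfer_1, inhibition_1, transfer_2, inhibition_2)
```

A negative value, as for Group I, indicates that the final sorting was faster than the last sorting of the original learning.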

In Table I are presented the objective data for all groups. 


TABLE I.*—EFFECT UPON TRANSFER AND INHIBITION OF QUANTITATIVE VARIATION
IN SIMILARITY BETWEEN ORIGINAL AND INTERPOLATED PATTERNS

Group | Sorting time                                            | Per cent | Per cent retroactive
      | $A^{1}$ | $A^{10}$ | $B^{1}$ | $B^{10}$ | $A^{11}$      | transfer | inhibition
------+---------+----------+---------+----------+---------------+----------+---------------------
I     | 1.07    | .34      | .35     | .30      | .31           | 97.0     | -9.1
II    | 1.07    | .28      | .45     | .28      | .34           | 56.4     | 15.4
III   | 1.10    | .28      | .48     | .30      | .45           | 52.4     | 40.5
IV    | 1.09    | .30      | 1.08    | .30      | .57           | 2.5      | 69.2
V     | 1.06    | .33      | 1.08    | .35      | 1.05          | -6.1     | 96.9
VI    | 1.06    | .32      | 1.05    | .28      | 1.04          | 3.0      | 93.6
VII   | 1.04    | .33      | 1.01    | .29      | .51           | 9.7      | 58.1
VIII  | 1.08    | .33      | 1.05    | .26      | .47           | 8.6      | 40.0
IX    | 1.06    | .34      | 1.03    | .25      | .42           | 9.4      | 25.0
X     | 1.05    | .31      | Intelligence test  | .3[illegible] | .....    | 25.8

* That the experimental groups were very closely equated can be seen from
inspection of the first column of the table.


In order to follow the temporal sequence of the experimental set-up, 
an analysis of the data for transfer will precede that for retroactive 
inhibition. In Fig. 1 the data for transfer are shown graphically. 

Inspection of the curve reveals that with increasing increments of 
dissimilarity between original and interpolated patterns there is a 
decrease in transfer to Pattern V (Group V) at which point a slight 
amount of interference or negative transfer is obtained. With increas- 
ing insertion of letters into the patterns the curve rises to a point 
slightly above the line distinguishing positive from negative transfer. 
The slight amount of transfer found for Groups VI, VII, VIII, and IX 
is attributed to the constant practice-effect of the sorting procedure 
itself. Compensation for this factor would tend to decrease the 
observable transfer effects throughout the curve and to increase the 
interference but would not alter the shape of the curve, which is 
the important factor in this study.

It is significant that increase in transfer is not directly proportional
to quantitative increase in similarity. That the leveling of the curve
between Patterns II to III and IV to V is not merely a chance variation
is suggested by an analysis of notes taken on “what the subject was
thinking.” The initial drop to Pattern II is accounted for by remarks
of subjects in Group II similar to the following:

“Oh, you’ve changed them. That’s mean. But the twenty-three 
is in the same place. Tricky, aren’t you? Now where was three 
anyway? That number you changed messed up the whole thing.”

Figure 1. Per Cent of Transfer from Original to Interpolated Patterns


The conclusion is drawn that the initial disruption is not directly 
proportional to the quantitative variation in similarity; that the 
temporary disruption caused by changing four of the numbers was 
only slightly increased by changing as many as eight. 

Similarly, the leveling of the curve between Patterns IV and V 
seems to be a reliable variation. In fact, two of the subjects in Group 
IV did not realize during the sorting of the first sixteen cards that any 
of the numbers were in the same place, but recognized the fact only 
“along about the middle of the deck.”

These observations indicate that the phenomenon of transfer
involves more than the objective similarity of individual “elements” of
a pattern. A cue to the explanation is given by analysis of the overt
motor reactions. In many instances the subjects actually put the 
cards in the interpolated pattern where they had learned to put them in 
the original pattern. Spontaneous correction was usually made by the 
subject himself, otherwise his attention was called to the fact by the 
experimenter. Much more frequent were partial responses in which 
the subject’s hand moved in the direction of the compartment in which 
the number had originally been sorted. This type of response per- 
sisted throughout Patterns II and III, but it is of significance that the 
response was rarely noticeable in Patterns IV to IX after the first 
sorting. This indicated that in the more similar material there was a
transfer from original to interpolated patterns not only of the individual 
elements of the pattern, but also of structure and organization of the 
pattern as a whole. With increasing dissimilarity of the patterns this 
organization was quickly disrupted, the interpolated material represent- 
ing a relatively new learning situation. This conclusion is supported 
throughout the experiment by the subjects’ introspective reports. It
will be further illustrated in connection with the data for retroactive 
inhibition. 

In Fig. 2 per cent of retroactive inhibition for each of the inter- 
polated patterns is shown graphically. 

With complete similarity between original and interpolated patterns
there is, as would be expected, a slight amount of “retroactive
facilitation.” From this point retroaction increases consistently with
each equal increment of dissimilarity. There is no leveling of the
curve between Patterns II to
III and IV to V as was found in per cent of transfer. This leads to the 
conclusion that temporary disruption of the learning pattern, caused 
by the insertion of the first four elements of the dissimilar arrangement 
(Pattern II), is relatively less effective when practice is continued on a 
large part of the original pattern. Similarly, the continued rise in the 
retroactive inhibition curve between Patterns IV and V indicates that 
complete disruption of the learning pattern, caused by changing twelve 
of the numbers of the original pattern, is relatively compensated for 
when practice is continued on even a small part (four of sixteen 
elements) of the original pattern. Inspection of the latter part of the 
curve warrants the following conclusions: (1) The absence of any con- 
siderable drop between Patterns V and VI indicates that the insertion 
of four letters into the interpolated pattern does not appreciably
diminish the damaging effects of the pattern upon the original learning.
(2) The consistent drop from Pattern VI to Pattern IX indicates that if
half or more of the interpolated pattern consists of entirely dissimilar 
material, there is considerably less disruption of the original learning, 
the decrease in damaging effects being roughly proportional to the 
quantitative decrease in similarity. (3) The very slight inversion 
of the curve from IX to X indicates, if the difference is at all significant, 
that the practice-effect of the sorting itself is operating throughout the 
curve, as was suggested by the transfer data. In Group IX the
interpolated material consisted entirely of letters. In Group X the
interpolated material was an intelligence test and did not involve card-sorting
at all. Compensation for this factor would tend to increase retroaction 
throughout the curve, but would not alter the general shape of the 
curve. 

The above analysis has been made on the basis of objective
similarity of individual elements of the patterns. Although the curve for
retroaction represents the objective relationship between quantitative
variation of similarity and inhibition, it in no way defines the types
of mental processes for the various groups. These can best be
understood through an analysis, such as that made for transfer, of the
non-quantitative data. It was suggested by subjects in Groups II and III
that in sorting the interpolated pattern they were able to “fit the
changes into the original pattern,” and that when they returned to the
original pattern they could remember almost perfectly where the origi- 
nal numbers had been. Two subjects from Groups VIII and IX 
reported that while they were sorting the letters they were associating 
them with the numbers which had originally been on the various com- 
partments and that when they returned to the numbers they very often 
thought of the corresponding letters. These data with others similar 
to them suggest that the elements of a pattern are not learned entirely 
as individual responses. They further suggest that some subjects did 
not learn the original and interpolated patterns as separate units at all 
but organized each into a more comprehensive pattern which involved 
the two. 

This suggestion is supported by notes taken from an analysis of 
overt partial responses. In every instance where the interpolated 
pattern contained numbers in different positions, there was an observa- 
ble tendency for the subjects initially to respond to the position in 
which the numbers originally had been learned. In Groups IV, V, and 
VI these responses dropped out entirely after the second sorting, 
indicating that the original organization was completely disrupted. 
This was not the case with Groups II, III, VII, and VIII. In sorting 
the interpolated pattern, two distinct overt responses were made to each 
number which appeared in a different position. A partial response was 
first made to the position where the number formerly had been. This 
was followed in the initial stages by a random search for the new posi- 
tion. As learning of the new position progressed, the purely random 
movements following the initial response developed into a direct move- 
ment in the direction of the new position. It is an interesting fact, 
however, that the initial partial responses in some subjects were never 
entirely eliminated. What is even more striking, when the original 
pattern was again sorted, the complete movement described above 
persisted for a short time. That is, the subject’s hand would first move
in the direction of the correct position but would then start to move in 
the direction of the interpolated position, which resulted in a necessary 
“backing up.” This type of response, however, was very quickly 
eliminated. 

These data serve as a reliable cue for certain theoretical
considerations. In some subjects the learning of the interpolated material
was superimposed upon the original learning and the resultant learning
was a combination of original and interpolated patterns. It is to be
observed that the type of response described above was nowhere
noticeable in Groups IV, V, and VI. It is assumed that the patterns are
here sufficiently dissimilar to the original pattern to disrupt the 
organization completely. In Patterns II and III a large part of the 
original learning was actually repeated in the interpolated material. 
It is suggested that after the original disruption, caused by changing 
four and eight elements of the pattern respectively, the pattern is 
quickly reorganized, not as an entirely new pattern, but as a part of a 
larger organization involving the original organization as well as the 
changes which must be made. In the interpolated patterns VII and 
VIII no part of the original pattern is repeated. This suggests that 
with increasing amounts of dissimilar material (letters) in the inter- 
polated pattern there is a corresponding decrease in the disruption of 
the original organization. 

These observations lead to the conclusion that the factor of organiza- 
tion specifically determines the conditions of transfer and inhibition. 

Amount of retroactive inhibition has been consistently interpreted 
in the experimental literature as a function of the similarity factor. 
Moreover, the assumption has been implied that the similarity factor 
was operating as a sole causal factor. The data presented in this paper 
are inconsistent with such an assumption. If we consider similarity 
entirely in terms of materials as did Robinson and Harden, the first 
five patterns are entirely similar (that is, they involve the same two- 
place numbers). With the insertion of letters into the patterns similar- 
ity decreases until with Pattern IX (all letters) there is complete 
dissimilarity. If we consider similarity of arrangement of patterns, 
similarity decreases to Pattern V, after which the patterns are all 
completely dissimilar. If similarity is considered as a function of the 
two variables, there is a continuous decrease from Pattern I to Pattern 
IX. The curve for retroactive inhibition, as found in this study, 
corresponds in direction to none of these theoretical relationships. In 
none of the similarity curves would be found the inversion characteristic 
of the curve for per cent of retroaction. Obviously, although we may 
describe our results in terms of amount of similarity between original 
and interpolated patterns, we cannot explain them entirely on this basis. 
It is necessary to postulate another factor which will satisfy the
cause-effect relationships. The factor of compatibility is here offered as an
explanatory principle.* Fortunately the experimental set-up offers
convincing evidence of its operation. In this experiment variation of 
the similarity factor has been limited to a scale extending from com- 
plete similarity of materials (the same numbers in both original and 
interpolated patterns) to complete dissimilarity of materials (numbers 
and letters). Thus throughout the first five patterns there is complete 
“similarity.” From Patterns VI to IX similarity is quantitatively
decreased by increments of twenty-five per cent with the insertion of 
each additional four letters into the interpolated pattern, until with 
Pattern IX there is complete dissimilarity. The factor of compatibil- 
ity differs from similarity, however, in the relationships expressed 
above. With Pattern I there is complete compatibility between the 
original and interpolated patterns (identity). With Pattern V there is 
complete incompatibility. That is, each number of the original pattern 
must, in the interpolated pattern, be associated with an entirely 
different overt response. Variation of objective compatibility between 
Patterns I and V is in direct proportion, quantitatively, to the amount
of variation in the interpolated pattern. With Pattern IX there is 
again complete compatibility; that is, learning Pattern IX is not 
dependent upon “unlearning” the original pattern. The relationships
expressed above are shown graphically in Fig. 3, together with the 
experimental curve for retroaction which has been inverted for pur- 
poses of comparison. As such, the curve can be interpreted as a meas- 
ure of retention. 
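
The relationships just described can be tabulated in a short sketch. The Python below is an illustration only and is not part of the original study. All names are hypothetical. The retroaction percentages are copied from Table I as printed. Two assumptions are made: compatibility is taken to rise from Pattern V to Pattern IX in equal steps of twenty-five points (the text states only the end points for that stretch), and the "inverted" retroaction curve is taken to be 100 minus per cent retroactive inhibition.

```python
# Illustrative sketch; every name here is hypothetical.
PATTERNS = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]

# Per cent retroactive inhibition for Groups I-IX, as printed in Table I.
RETROACTION = [-9.1, 15.4, 40.5, 69.2, 96.9, 93.6, 58.1, 40.0, 25.0]


def similarity(index: int) -> float:
    """Per cent similarity of materials for the pattern at a 0-based index.

    Patterns I-V use the same sixteen numbers; each later pattern
    replaces four more of the sixteen numbers with letters.
    """
    letters = max(0, index - 4) * 4
    return 100.0 * (16 - letters) / 16


def compatibility(index: int) -> float:
    """Per cent compatibility for the pattern at a 0-based index.

    Falls from 100 (Pattern I) to 0 (Pattern V) as stated in the text.
    ASSUMPTION: rises again in equal 25-point steps to 100 at Pattern IX.
    """
    return 25.0 * abs(index - 4)


def retention(retroaction_pct: float) -> float:
    """ASSUMPTION: the inverted curve is 100 minus per cent retroaction."""
    return 100.0 - retroaction_pct


def curves():
    """Return (pattern, similarity, compatibility, retention) rows."""
    return [
        (name, similarity(i), compatibility(i), round(retention(RETROACTION[i]), 1))
        for i, name in enumerate(PATTERNS)
    ]


if __name__ == "__main__":
    for name, sim, comp, ret in curves():
        print(f"{name:>4}  similarity={sim:5.1f}  "
              f"compatibility={comp:5.1f}  retention={ret:5.1f}")
```

Under these assumptions the retention values for Patterns I to V (109.1, 84.6, 59.5, 30.8, 3.1) follow the compatibility values (100, 75, 50, 25, 0) closely, while those for Patterns VI to IX (6.4, 41.9, 60.0, 75.0) rise with compatibility but remain below the value for Pattern I.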

Comparing the curves for similarity and compatibility, three points 
are located on the base-line: Pattern I, complete similarity and complete 
compatibility; Pattern V, complete similarity and complete incom- 
patibility; Pattern IX, complete dissimilarity and complete com- 
patibility. The relative effects of these possible relationships upon 
retention can be measured directly from the empirical curve. From 
Pattern I to Pattern V similarity is held constant and compatibility is
varied; the experimental curve corresponds almost exactly to
quantitative decrease in compatibility. From Pattern V to Pattern IX
similarity decreases and compatibility increases; the corresponding curve for
retention rises with increasing compatibility, but, with decreasing
similarity, it never reaches the level obtainable with perfect similarity
and perfect compatibility. The relationships for Group X are shown
for comparison with a different kind of entirely dissimilar material.

*No attempt is made here to identify compatibility psychologically. It is
assumed that it is roughly analogous to muscular antagonism. The writer is
indebted to Dr. Howard Easley of Duke University for suggesting the use of the
concept in interpreting these data.

From these data it would seem that the effects upon retention of
the three possible relationships between similarity and compatibility
are as follows:

1. High similarity, high compatibility: high retention.

2. High similarity, high incompatibility: low retention.

3. Low similarity, high compatibility: relatively high retention,
but not so high as under condition 1.

Figure 3. Relation Between the Curves for Similarity (Dotted Line),
Compatibility (Broken Line), and Retroactive Inhibition


The preceding analysis has been made on the basis of data for
retroactive inhibition converted into “retention scores.” Similarly, an
analysis of the data for transfer reveals the necessity of postulating, in
addition to the similarity factor, the factor of compatibility. The
table below shows the various possible relationships between the two
factors, together with the actual and total observable effects.

Similarity | Compatibility | Effects                               | Total observable effect
High       | High          | Large positive transfer. Small, if    | Facilitation
           |               | any, negative transfer (inhibition).  |
High       | Low           | Small, if any, positive transfer.     | Interference
           |               | Large negative transfer.              |
Low        | High          | Small, if any, positive transfer.     | None
           |               | Small, if any, negative transfer.     |
Low        | Low           | This condition is experimentally impossible.

It has been stated before that quantitative conclusions cannot be
interpreted solely on the basis of individual elements of a learning
pattern, that the factor of organization must be considered. What,
then, is the relation between organization and the variables of similarity
and compatibility on the one hand, and transfer and inhibition on the
other? A solution to the problem has been suggested in the analysis
of the non-quantitative data. Learning to sort sixteen cards into
various compartments involves to some degree, in addition to the
individual elementary associations formed, the organization of these
elements into a unitary pattern. With the insertion of the interpolated
material, a new pattern is formed. The degree to which the elements
of the original pattern can become a part of this new organization
determines the amount of positive or negative transfer. This in turn
determines the amount of retroaction and retention. The factors of
similarity and compatibility are suggested as determining the degree to
which elements of the original pattern can become a part of such an
organization.


BOOK REVIEWS


K. E. OBERHOLTZER. An Integrated Curriculum in Practice. New 
York City, Teachers College Contributions to Education, No. 694, 
1937, pp. XV + 218. 


The author essayed a difficult task when he tried to evaluate a 
curriculum from its effects on the pupils. The experiment was carried 
out in Houston, Texas, the subjects being two thousand pupils in 
Grades IV and V. From these, complete records of sixteen hundred 
sixty-two pupils were secured. The pupils were divided into three 
groups, A, B and C; A and B being experimental groups and C a control 
group. Group A was freed from all restrictions in teaching the new 
curriculum; Group B was restricted as to time distribution; and Group 
C used the old curriculum. The groups were tested at the beginning 
by numerous tests and their progress measured by the improvement of 
scores on the New Stanford Achievement Tests, West’s Handwriting 
Scale, and Baker’s “What I do.” Information tests based on the
specific units taught were also given. The results showed that the
pupils using an integrated curriculum did as well in fundamental 
skills as those using a more formal one. In some respects they did 
better. The study seems to have been carefully conducted and con- 
trolled. However, there was one thing that could not be controlled; 
namely, the enthusiasm of the author for the new curriculum. It is 
interesting to speculate as to what would have been found if the teach- 
ers of Group C had been the white-headed boys instead of those who 
taught Groups A and B. PETER SANDIFORD. 

University of Toronto. 


EDWARD M. WESTBURGH. Introduction to Clinical Psychology.
Philadelphia: P. Blakiston’s Son & Co., 1937, pp. XIII + 336. 


In the first paragraph of his preface, Dr. Westburgh makes his first 
error when he says that, “students of psychology and medicine have no
access to this accumulating knowledge of facts, principles, and 
methods” in clinical psychology. This was not true even at the 
time the book was written. Further on he says that this book is an 
attempt at a formulation of the “principles and philosophy of Clinical 
Psychology” and implies that it summarizes the facts and methods. 

It is impossible to demonstrate in a short review that neither
methods, facts, nor principles are adequately dealt with. The book is
largely devoted to methods. In an appendix of seventeen pages,
printed in six-point type, is an elaborate analytic “Outline for the
Clinical Study of Personality.” Five of the eleven chapters are devoted
to discussion of various sections of this outline and an appreciable
portion of four other chapters coincides closely with it. The
description and evaluation of information desirable in the history, to which
four chapters are devoted, is adequate. In the chapter on “Test
Results” there are praiseworthy cautions concerning the interpretation
of IQ’s, but at the same time the author says “IQ’s are regarded as
having approximately the same meaning for all tests.” He devotes a
page and a half to calculation of adult IQ’s with sixteen as the divisor in
spite of the changes in practice that started at least twenty years ago. 
Also in this chapter there is a disproportionate amount of space devoted 
to elementary statistics. 

In a clinical textbook one could reasonably expect to find a syste- 
matic discussion of the different kinds of problems encountered. 
Nowhere in this book is there any such discussion. There are two 
chapters on “Cognitive Factors” in which feeblemindedness is
considered in the midst of an incoherent presentation of the theory of
intelligence. In one of these chapters (p. 72) we find the statement,
“As long as a person can keep out of a feebleminded institution, a
prison, or a hospital for the insane, he cannot be considered
feebleminded, or criminal, or insane.”! In two chapters on “Affective
Factors” there are some brief statements concerning the
symptomatology of certain mechanisms and neuroses, but nothing more than
passing mention of a few typical problems. 

Finally, there are the principles of clinical psychology. These are 
apparently presented in the first and last chapters and, incidentally, 
throughout the book. In the first chapter nine of sixteen pages are 
devoted to a reprinting of Freud’s article on psychoanalysis from the 
Encyclopedia Britannica. Elsewhere in the book, however, the author 
apparently doesn’t consider psychoanalysis very adequate. The 
last chapter on “Some Fundamental Concepts of Clinical Psychology”
repeats some things that have been said, discusses fields in which clinical 
psychology may be used, and has a rather good summary of psycholog- 
ical types of therapy. 

There is a selected bibliography which the author says is
“comprehensive enough for any interested student to use as a beginning for the
study of advanced clinical concepts and branches of science related to
clinical psychology.” This bibliography does not include any of the
published books on clinical psychology and does not mention a single
book of cases, such as those published by the Commonwealth Fund.
Surely these are serious lacks.

Like so many books this one does not fulfill the promise of the
preface. It can scarcely be considered even an introduction to clinical
psychology. C. M. LOUTTIT.

Indiana University.


JOHN EDWARD BENTLEY. Problem Children. New York: W. W. Nor- 
ton and Company, 1936, pp. 442. 


Thanks to the influence of the mental hygiene movement the 
words “problem children” often carry the meaning of behavior
problems in children as studied and treated in mental hygiene clinics. This
implies to many an interest in the individual approach to the study of 
the child by way of the integrated team-work of psychologists, psy- 
chiatrists, social workers, and educators. Bentley’s book on Problem 
Children is not such a book. In a sense the name is a complete mis- 
nomer. The content of the book is based on courses given by the 
author in the education of handicapped children and that is what the 
book is about—the physically, the socially, the psychologically 
handicapped. 

Approximately one-third of the volume is concerned with physical 
disabilities. Here one can find statistical facts about the physical 
defects of children in our public schools; some facts about crippled 
children in the schools; facts about visual, auditory, and motor disabili- 
ties; lists and brief descriptions of tests used in examining all these 
defects, and suggestions for a health program for the physically 
handicapped in our schools. Intellectual, social, and educational 
disabilities of the children in our schools are similarly but not as com- 
prehensively treated. The description, for example, of the educational 
disabilities is limited to reading difficulties and a remedial treatment for 
reading. Part Three, wherein are considered the social disabilities of
children, is concerned with a factual description centered about
Wickman’s book on Children’s Behavior and Teachers’ Attitudes; a brief
consideration of juvenile delinquency as a behavior problem, and a 
chapter on the child guidance clinics in the public schools. 

The book, then, contains relatively little that will be of use to 
clinicians or mental hygienists, but it will be useful to school people who 
want an outline in short precise form on information about handicapped 
children, tests to use for examining their disabilities, and the methods 
to use in treating them. H. MELTZER. 

Psychological Service Center, St. Louis.